NASA/Caltech

Software Engineering Intern

March 2021 - September 2021

Python
Pandas
NumPy
Django
Seaborn
AWS
Heroku
SSL
PostgreSQL
LaTeX

I joined Caltech in the spring of 2021 as a Software Engineering Intern, working under the guidance of Dr. Davy Kirkpatrick and Federico Marocco. My research focused on detecting brown dwarfs: dim, low-mass objects in the solar neighborhood, the region of space close to our solar system. During my seven-month tenure, I cleaned data, developed a dashboard for data storage and filtering, worked around a memory leak in a charting program, and co-authored a paper [1].

Data Cleaning

My first task was to clean two decades' worth of data using Pandas and NumPy. Many people had handled the data over the years, which led to numerous inconsistencies: varying date formats, missing observation data, misclassified columns, and occasional notes mixed in with data entries. The task was complex, but the functionality provided by Pandas and NumPy streamlined it significantly.
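The following is a minimal sketch of this kind of cleanup, not the code used on the project. The column names, sample values, and the list of date formats are all assumptions made up for illustration. It tries each assumed date format in turn and separates free-text notes from numeric entries.

```python
import pandas as pd

# Hypothetical raw table; columns and values are invented for illustration.
raw = pd.DataFrame({
    "obs_date": ["2004-03-15", "03/16/2004", "17 Mar 2004", "see notes"],
    "magnitude": ["15.2", "14.9", "n/a", "16.1 (cloudy)"],
})

# Assumed set of formats found in the data.
DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %b %Y"]


def parse_dates(column: pd.Series) -> pd.Series:
    """Try each known format in turn; values matching none end up as NaT."""
    parsed = None
    for fmt in DATE_FORMATS:
        attempt = pd.to_datetime(column, format=fmt, errors="coerce")
        # Fill only the entries that earlier formats could not parse.
        parsed = attempt if parsed is None else parsed.fillna(attempt)
    return parsed


def clean(frame: pd.DataFrame) -> pd.DataFrame:
    out = frame.copy()
    out["obs_date"] = parse_dates(out["obs_date"])
    numeric = pd.to_numeric(out["magnitude"], errors="coerce")
    # Preserve the original text of non-numeric entries so notes are not lost.
    out["magnitude_note"] = out["magnitude"].where(numeric.isna())
    out["magnitude"] = numeric
    return out


cleaned = clean(raw)
```

Keeping the unparseable text in a separate note column, rather than dropping it, leaves a trail for anyone who later needs to check an entry by hand.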

Dashboard

Once the data was clean, the next step was to create a user-friendly interface for researchers to easily access it. Google Sheets proved to be slow and inefficient for searching specific parameters, especially with thousands of rows and hundreds of columns of data needing constant updates. Instead of forcing researchers to write SQL queries, I proposed and developed a Django-based internal dashboard. The dashboard offered extensive search features and a clean interface, allowing dynamic viewing of columns, CSV exports, and the ability to upload new datasets.

Shown here is the dashboard that I developed. Columns are dynamically viewable with the ability to export data into a CSV for further analysis.

After multiple iterations based on feedback, the final version allowed filtering on every column, integration with NASA’s API for star viewing, and exporting data into CSV format for further analysis. It also featured a CI/CD pipeline with test deployments on Heroku and stable releases on AWS. The dashboard remained in use for three years, until the project concluded in late 2023.
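The filter-and-export behavior can be sketched without any web framework. The example below is not the Django implementation; it uses only the Python standard library, and the row data, field names, and the helpers `filter_rows` and `to_csv` are hypothetical. It shows the core idea: filter on any column, choose which columns to show, and serialize the result as CSV.

```python
import csv
import io

# Hypothetical rows; field names are illustrative, not the project's schema.
ROWS = [
    {"name": "obj-a", "spectral_type": "T8", "distance_pc": 12.5},
    {"name": "obj-b", "spectral_type": "L5", "distance_pc": 18.0},
    {"name": "obj-c", "spectral_type": "T8", "distance_pc": 21.3},
]


def filter_rows(rows, filters):
    """Keep rows whose values equal every (column, value) pair in filters."""
    return [
        row for row in rows
        if all(row.get(col) == val for col, val in filters.items())
    ]


def to_csv(rows, columns):
    """Serialize only the selected columns, in the given order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently skip columns that were not selected
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


selected = filter_rows(ROWS, {"spectral_type": "T8"})
csv_text = to_csv(selected, ["name", "distance_pc"])
```

In a Django view, the same filters would more likely be translated into queryset lookups so that the database does the filtering.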

Memory Leak Fix

One of my final projects involved addressing a memory leak in Findercharts, a chart generation tool used to visualize stars from satellite images. The software, developed by Caltech researchers, was no longer actively maintained, and it had begun crashing after producing a single chart. After some debugging, I discovered that the issue stemmed from improperly handled FITS-encoded images, which are much larger and more complex than typical images. Given the looming deadline for the paper, I proposed an interim solution: a Bash process handler that ran each chart generation in its own process and automatically terminated that process afterward, so leaked memory never accumulated. This workaround allowed the paper to be submitted on time.
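The idea behind the workaround is process isolation: when each chart is produced by a short-lived process, the operating system reclaims all of its memory on exit, leak or no leak. The original handler was written in Bash. The sketch below shows the same idea in Python with `subprocess`, and `CHART_SNIPPET` is a hypothetical stand-in for the real chart command, which is not reproduced here.

```python
import subprocess
import sys

# Hypothetical stand-in for the real chart generator: prints a line per target.
CHART_SNIPPET = "import sys; print('chart for ' + sys.argv[1])"


def run_isolated(job_args, timeout_s=300):
    """Run one chart job in a fresh interpreter process.

    Whatever memory the job leaks is returned to the OS when the process exits.
    """
    command = [sys.executable, "-c", CHART_SNIPPET, *job_args]
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # stop a hung job instead of blocking the batch
    )
    return result.returncode, result.stdout.strip()


def generate_all(targets):
    """Generate charts one process at a time, recording each exit status."""
    results = {}
    for target in targets:
        results[target] = run_isolated([target])
    return results


results = generate_all(["target-1", "target-2"])
```

This trades some process start-up cost per chart for a batch that keeps running, which is usually acceptable for an interim fix.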

Charts 1 through 5 are examples of charts generated by the software. Charts 2 and 3 show a brown dwarf, whose green hue in the W3/W2/W1 color composite indicates the presence of methane. The remaining charts don't show brown dwarfs but are still interesting to look at.

Writing the Paper

After the technical work was completed, I helped draft a paper for The Astronomical Journal [1]. The paper was a collaboration between researchers at Caltech, NASA, and other institutions, together with citizen scientists. It was written in LaTeX, and I contributed many of its figures and graphs. Going through peer review and making revisions was an eye-opening experience that deepened my interest in academia.


  1. Kota, T., Kirkpatrick, J. D., Caselden, D., Marocco, F., Schneider, A. C., Gagné, J., Faherty, J. K., Meisner, A. M., Kuchner, M. J., Casewell, S., Kacholia, K., Bickle, T., Beaulieu, P., Colin, G., Hamlet, L. K., Schümann, J., & Tanner, C. (2022). Discovery of 16 New Members of the Solar Neighborhood Using Proper Motions from CatWISE2020. The Astronomical Journal, 163(3), 116. https://doi.org/10.3847/1538-3881/ac4713