Here is a list of projects I have worked on. Click on any card for more details.
Detection of Bars in Galaxies Using Deep Learning
Mining Observatory Logs Using Natural Language Processing
Kodaikanal Solar Observatory Digitized Data
Data Archiving Solutions for Biological Data
The CZTI Data Management System
The Catalina Real-time Transient Survey
The "Dark Skies for All" Portal
Leveraging Alfresco
QuickDB
OAD Volunteer Portal
Spectral Classification Using Deep Learning
Galaxies come in a wide variety of shapes. Some galaxies exhibit a bar-like structure. It turns out that this bar's gravitational field causes interesting changes in how a galaxy evolves, so it is very important for us to classify galaxies as barred or unbarred. And how does one do that? Well, you stare at the image and your brain tells you it is a barred galaxy. Okay, let's do that for the millions of galaxies observed by various astronomical surveys. Sounds like a job for deep learning!
I worked with a group that explored the use of deep learning algorithms, specifically Convolutional Neural Networks, to train a machine to classify galaxies as barred or unbarred. The algorithm does a fantastic job, with an accuracy of 94%. More details here.
Observatories go through a relatively standard process of planning, approval, commissioning and testing. Problems are encountered in all these stages. Scientists record and share their observations and solutions in the form of logs written in plain language. A system that can mine these logs, allowing newer observatories to benefit from the wisdom contained in them, would be of immense value.
A system based on deep learning and Natural Language Processing was created to help mine the logs of the LIGO (Laser Interferometer Gravitational-Wave Observatory) project. The result: Hey Ligo!
The Kodaikanal Solar Observatory has been operational for more than a century. Dedicated observers have been at work since 1904, imaging the Sun using photographic plates. Given that the Sun exhibits periodic changes, such as sunspot cycles, over periods spanning decades, such a data set is immensely important. To make the data amenable to the sophisticated computerized analysis routines now available, they must be present in digitized form. The folks at the Indian Institute of Astrophysics (IIA, which now operates the Kodaikanal Observatory) have carried out an immense undertaking, lasting several years, to digitize these data.
As a part of my project, we decided to build an archive for these data. We used modern web frameworks to design a data archive that makes data access easier for the scientific community. Novel features of this site include the following:
You can visit the site at this link.
I am also working with colleagues from the Regional Centre for Biotechnology in Faridabad to design data storage and dissemination solutions for large data sets gathered using complex instruments such as the new X-ray diffraction facility at the European Synchrotron Radiation Facility. Biological data have generally been produced in individual labs, so there has been little need for a centralized archive with appropriate search and download services. The situation in astronomy has been the opposite: astronomers have often had to work at far-off facilities that are too hard to construct using a single lab's resources.
The idea of this ongoing work is to adopt battle-tested techniques from astronomy and apply them to help organize biological data better.
The Cadmium Zinc Telluride Imager (CZTI) is one of five payloads onboard the Astrosat satellite, the first ever space telescope launched by India. The Payload Operations Center for this instrument is situated ... well, right next to where I sit! So, in collaboration with them, we decided to develop a fairly complex database management system. Features include:
In an ideally planned project, such a system would allow storage of detailed metadata, sufficient to construct any version of the data product. Thus, as new versions get created, one may dispose of these products knowing that should the need arise, the metadata can be used to reconstruct the product on demand.
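The idea of keeping enough metadata to rebuild any product version on demand can be sketched as follows. This is only an illustration under stated assumptions, not the CZTI system itself: the names `ProductRecipe`, `rebuild`, `pipelines` and `raw_store` are hypothetical, and the "pipeline" is a stand-in for whatever processing software actually creates the product.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductRecipe:
    """Hypothetical metadata record: everything needed to recreate a product."""
    raw_inputs: tuple        # names of the raw files the product was built from
    pipeline_version: str    # version of the processing software that was used
    parameters: tuple        # (name, value) pairs passed to the pipeline

    def product_id(self) -> str:
        # A deterministic identifier derived only from the recipe, so the same
        # recipe always names the same product.
        payload = json.dumps(
            [self.raw_inputs, self.pipeline_version, self.parameters]
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def rebuild(recipe, pipelines, raw_store):
    """Recreate a deleted product by re-running the recorded pipeline version."""
    run = pipelines[recipe.pipeline_version]
    data = [raw_store[name] for name in recipe.raw_inputs]
    return run(data, **dict(recipe.parameters))
```

With a record like this, only the raw data and the recipes need to be kept permanently; a derived product that has been disposed of can be regenerated by calling `rebuild` with its recipe.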
The Catalina Real-time Transient Survey is a collection of three telescopes that have operated for more than a decade. The telescopes visit specific areas of the sky at defined intervals and take images. These images are then processed to derive photometric measurements. IUCAA hosts the image archive. Some features of this image archive include:
The site may be accessed here.
"Dark Skies for All" is an IAU 100 program that aims to bring people together and sensitize communities around the world to the importance of preserving our dark skies. By reaching out to the media, government agencies, colleges and institutes, the dark sky ambassadors will try to help people preserve the beauty of the night skies.
I led the team that constructed the portal for managing this program. The portal can be accessed from this link.
The Office of Astronomy for Development (OAD) is a joint project of the International Astronomical Union (IAU) and the South African National Research Foundation (NRF) with the support of the Department of Science and Technology (DST). The OAD primarily carries out its mission by funding and coordinating projects that use astronomy as a tool to address issues related to sustainable development. Since 2013, more than 120 projects have been funded through the annual Call for Proposals.
In the first part of the project, we helped the OAD adapt the Alfresco Enterprise Content Management system to manage their project sources. We then built an interface that enables users to search through materials produced by OAD's past funded projects. Users specify keywords in the primary search box. They can further narrow the search results using the filters for Project Title, Location, Year (of project implementation), and SDG (View the Sustainable Development Goals).
The interface may be accessed from this link.
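The keyword-plus-filters behaviour described above can be sketched in a few lines. This is a minimal illustration, not the portal's code, and it does not use Alfresco's API: the function `search_projects`, the record fields and the two sample records are all made-up placeholders.

```python
def search_projects(projects, keywords, title=None, location=None,
                    year=None, sdg=None):
    """Return projects matching every keyword and every filter that is set."""
    words = [w.lower() for w in keywords.split()]
    results = []
    for p in projects:
        text = f"{p['title']} {p['description']}".lower()
        # Primary search box: every keyword must appear somewhere in the text.
        if not all(w in text for w in words):
            continue
        # Optional filters narrow the keyword results further.
        if title is not None and title.lower() not in p["title"].lower():
            continue
        if location is not None and p["location"] != location:
            continue
        if year is not None and p["year"] != year:
            continue
        if sdg is not None and sdg not in p["sdgs"]:
            continue
        results.append(p)
    return results


# Placeholder records for illustration only; these are not real OAD projects.
PROJECTS = [
    {"title": "Example Project A", "description": "telescope workshops for schools",
     "location": "Country X", "year": 2015, "sdgs": [4]},
    {"title": "Example Project B", "description": "astronomy data skills training",
     "location": "Country Y", "year": 2018, "sdgs": [4, 8]},
]
```

For example, `search_projects(PROJECTS, "astronomy", sdg=8)` returns only the second placeholder record, while an empty keyword string with `year=2015` returns only the first.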
QuickDB is an in-house database with an SQL interface that enables astronomers to access data from the Subaru Hyper Suprime-Cam (HSC). QuickDB has been designed to leverage in-RAM processing and MapReduce to allow very quick processing of SQL queries on terabyte-scale data. The SQL functionality has been extended with functions that astronomers commonly need, such as cross-matches, histograms (1-d and 2-d), and flux-to-magnitude conversions.
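To give a feel for two of the operations mentioned above, here is a small plain-Python sketch of a flux-to-magnitude conversion and a 1-d histogram. It is not QuickDB code and does not reflect QuickDB's SQL function names, which are not given here. It assumes fluxes are flux densities in janskys and uses the standard AB magnitude definition, in which a source of 3631 Jy has magnitude zero; the function names are hypothetical.

```python
import math

AB_ZERO_POINT_JY = 3631.0  # flux density of a magnitude-zero source (AB system)


def flux_to_ab_mag(flux_jy):
    """Convert a flux density in janskys to an AB magnitude."""
    if flux_jy <= 0:
        raise ValueError("magnitude is undefined for non-positive flux")
    return -2.5 * math.log10(flux_jy / AB_ZERO_POINT_JY)


def histogram_1d(values, lo, hi, nbins):
    """Count values in nbins equal-width bins covering the range [lo, hi)."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v < hi:
            # Clamp the index to guard against floating-point edge effects.
            idx = min(int((v - lo) / width), nbins - 1)
            counts[idx] += 1
    return counts
```

A source 100 times fainter than the zero point comes out 5 magnitudes fainter, which is a handy sanity check for any such conversion.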
I led the team that constructed a scheduler and health keeping system for this database. I personally constructed a Jupyter notebook interface to interact with the scheduler. The system is currently being deployed for an upcoming data release and will be public soon.
The Office of Astronomy for Development (OAD) is a part of the International Astronomical Union (IAU) and is responsible for coordinating projects around the world aimed at using astronomy for the general public good. Many good souls come forward to volunteer for these events, but there has been no efficient system in place for coordinating the volunteers.
I have been working with the OAD office based in Cape Town, South Africa, to construct a system that helps advertise projects needing volunteers, connects project coordinators with volunteers, and more.
Stars come in a wide variety of classes. For historical reasons, these classes have been given the labels O, B, A, F, G, K and M. Within each category, there are sub-classes designated with the numbers 0 to 9. So that's 70 classes. Plus, for each type, there can be five "luminosity" classes. Modern telescopes have obtained spectra for millions of stars. It is not feasible for human experts to sit down and classify these stars on the basis of their spectra. Several automated techniques have been proposed in the past. One of the most commonly used relies on matching a given spectrum against a set of templates. This can be quite inefficient computationally.
We managed to use autoencoders and 1-d Convolutional Neural Networks not only to improve on the performance of the current state of the art by a factor of two but also to speed up the computation several-fold.
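The class counting and the template-matching baseline described above can be sketched as follows. This is a toy illustration, not the method we used: it assumes the five luminosity classes are labelled I to V, and `best_template` is a hypothetical helper that compares spectra sampled on the same wavelength grid using a simple sum of squared differences.

```python
from itertools import product

SPECTRAL_TYPES = "OBAFGKM"
SUBCLASSES = range(10)                            # 0 through 9
LUMINOSITY_CLASSES = ["I", "II", "III", "IV", "V"]  # assumed labels


def all_classes():
    """Enumerate every (type, sub-class, luminosity class) label."""
    return [
        f"{t}{s}{lum}"
        for t, s, lum in product(SPECTRAL_TYPES, SUBCLASSES, LUMINOSITY_CLASSES)
    ]


def best_template(spectrum, templates):
    """Return the label of the template closest to the given spectrum.

    `templates` maps a class label to a list of flux values on the same
    wavelength grid as `spectrum`.
    """
    def squared_error(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(templates, key=lambda label: squared_error(spectrum, templates[label]))
```

The enumeration gives 7 x 10 = 70 spectral classes and 350 labels once luminosity is included. Because `best_template` compares each spectrum against every template, its cost grows with the number of templates times the number of spectra, which is where the inefficiency of this approach comes from.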