Aashman Rastogi

About Me

Hi! Thank you for visiting my webpage.

I'm a junior at the University of California, Los Angeles (UCLA) and am majoring in Statistics and Data Science Engineering with a keen fascination towards Artificial Intelligence and Machine Learning. My journey has been a diverse one - my passion towards outer space rooted itself within me as a ninth-grader, igniting a profound desire to become a planetary scientist and inspiring a fervor that initially led me down the path of Geology at UCLA.

My academic path has been filled with inspirational moments, from getting to interact and take guidance from esteemed professors like Prof. David Piage (UCLA), working on groundbreaking projects like the Perseverance rover on Mars and Lunar Reconnaissance Orbiter . It was through this voyage that I was introduced to the mesmerizing world of remote sensing, marking the beginning of my endearing journey with data science.

I am fervently driven by a vision to harness Generative AI and Machine Learning and deeply passionate about their transformative potential in industries as diverse as finance, consulting, autonomous vehicles, agriculture, disaster management, energy, environment, and government. Be it finance or farming, my goal is simple: to use technology to solve problems and improve lives.

Outside of academics, I have a variety of interests and hobbies that enrich my life. Since the age of six, I've been immersed in the world of sketching and painting, experimenting with mediums from oil and acrylics to colored pencils. This artistic journey has been transformative, teaching me both attention to detail and patience. I've devoted a special section to showcase my artwork and hope it brings you as much joy as it has brought me. I also enjoy swimming, playing badminton and cricket, watching documentaries, movies and hanging out with my friends and having discussions on a wide range of topics.

I hope this website helps to gives you an idea of who I am and what I am passionate about. Below I list my skills to help you get a understanding of my technical abilities. Enjoy exploring this website!

Skills

Programming Languages

C++
R
SQL
Python (Numpy, Pandas, Matplotlib, Seaborn, Plotly, Scikit-Learn)
Shell
HTML
CSS

AI Libraries

Langchain
Openai
Tensorflow
Huggingface
PandasAI

Large Language Models(LLM) and Techniques
GPT-3.5-turbo
Llama-2
Prompt Engineering

Version Control
Git
Github

Softwares and Tools
Docker
Xcode
Visual Studios (VSCode)
R-studios
Microsoft Suit
G-suit
Notion
Airtable
Figma (UI/UX)

Geographic Information System(GIS) tools
Qgis
ArcGIS

For more details, please see my CV or visit the 'Contact' section to know how we could get in touch.

Projects

My journey as an enthusiast has led me to explore a myriad of domains and industries, ranging from the awe-inspiring realms of space and earth science to the captivating world of business analytics. Here, you will find a diverse collection of my works, including my passion-fueled hobby projects, academic endeavors, and impactful internship projects. At the heart of my analytical pursuits lies Python, serving as the cornerstone of all my data-driven analyses. Within the realm of remote sensing, I harness the power of satellite imagery from sentinel-2 to detect and isolate wildfires, while also leveraging drone imagery to gain insights into vegetation stress. Venturing into the world of business analytics and predictive modeling, I rely on the prowess of scikit-learn to employ machine learning models, deftly addressing a wide array of challenges. I have currently delved into the intriguing realm of Generative AI. My focus lies in harnessing large language models to craft sophisticated systems tailored for a diverse array of use cases.

Below, I list my current and past projects. For an up-to-date record of my projects, please see my github page or CV.

Current projects

Advancing Generative AI at Deloitte: A Comprehensive System for Code Generation, Explanation, Conversion, Debugging, and Optimization
Advancing Generative AI at Deloitte: A Comprehensive System for datafarme analysis and Visualizations using GPT and PandasAI

Past projects

Dockerized ML Package Deployment in Ghana: Empowering AI Solutions business growth
Credit Default Prediction Model: Identifying High-Risk Individuals Based on Personal Characteristic
Drone-Mounted IR Camera System for Vegetation Health Analysis: Enhancing Environmental Monitoring and Management
Satellite-Based Fire Detection and Water and Land Cover Visualization Using Copernicus Data from ESA

Below, I give an overview of my current and past projects that I have led or where I made substantial contributions.

Leveraging Generative AI for code generation, explanation, conversion, debugging and optimization from natural language prompts

A video explaining how AI is being leveraged to bring about transformation in financial institutions and businesses. Credits: Deloitte AI Institute US

This project has a noble goal of making coding accessible to everyone, including those without technical expertise. To achieve this, I utilize GPT-3.5 Turbo model, which I fine-tune using prompt engineering to tailor it for specific tasks.

My approach involves a well-structured flow control system that ensures safety and efficiency. First, I implement a moderation check to maintain a friendly and appropriate environment. Then, I validate user input to ensure it aligns with the intended coding function. If the input is valid, the system proceeds to carry out the user's requested task. To maintain quality and accuracy, the model evaluates its own performance after each task. Additionally, I have taken care to include necessary error handling and try-except blocks to handle any unexpected situations that may arise.

In addition to the features mentioned earlier, the project goes beyond simple task-based interactions. I have implemented a conversation capability with the model, allowing users to ask follow-up or recurrent questions related to previous tasks. This is made possible by enabling memory storage and context understanding in the system.

A video by DeepLearning.AI on building systems using Chat Gpt

By incorporating memory storage and context understanding, I aim to make the coding journey even more personalized and user-friendly, making it easier for non-technical individuals to grasp coding concepts and apply them effectively in their projects. This further enhances the democratization of coding and empowers users to engage in meaningful and productive coding conversations with the system.
Leveraging Generative AI for Comprehensive DataFrame Analysis Based on Contextual Understanding of User Query

Stock Image

The primary objective of this project is to democratize advanced DataFrame analysis, making it accessible to both technical and non-technical users alike. By allowing users to upload datasets and pose questions in natural language, the system generates comprehensive, contextually relevant answers that encompass textual explanations, graphical representations, and data visualizations.

From a technological perspective, the project is anchored on the integration of GPT and a specialized library called PandasAI, specifically designed for DataFrame analysis. Importantly, we have customized PandasAI's source code to align it with our unique use-cases. This customization extends to modifying prompts and editing, adding, or removing functions that process code snippets generated by the model for execution. Such fine-tuning grants the system the versatility and accuracy required to manage a diverse range of data types and user queries to provide only the neccesary output.

One of the significant challenges in DataFrame analysis is the merging of data from disparate sources. This is where the project distinguishes itself. Through prompt engineering, the language model itself generates instructions on which DataFrames to select for the most relevant data, which is then formatted and used in subsequent merging processes. The system is enabled to understand user queries in context, subsequently identifying common columns across multiple DataFrames to act as keys for merging. This ensures that the answers generated are not just accurate but are based on a comprehensive view of all available data.

PandasAI

Main salient feature of the project is its data visualization capabilities. Post data merging and analysis, the system utilizes HTML and JavaScript to produce interactive data plots, offering a superior user experience compared to standard Python plots. Furthermore, the system has undergone extensive testing to handle a range of user queries effectively, even those that may be partially coherent or unclear as well as bait questions aimed at confusing the model. This iterative process of testing, debugging, and prompt editing has yielded an impressive accuracy rate of 90%. Multiple checkpoints and output-formatting guidelines have been incorporated to reduce inconsistencies and enhance the quality of the output.

Addressing the common issue of missing data in data analysis, the system provides users with two primary options: interpolation and filling of null values. These methods are tailored to the data type of each column. For numerical columns, the system employs interpolation techniques that minimize the Residual Sum of Squares (RSS), and fill null values fills rows with the mean values ensuring more accurate data representation.

In summary, this project aims not merely to simplify DataFrame analysis but to fundamentally transform it into a more intuitive and user-friendly experience. Through leveraging advanced AI technologies and extensive customization, it significantly reduces the expertise and time needed for comprehensive data analysis, thereby serving the larger goal of making data science accessible to all..
Dockerization of 'NAVIK-MARKETING-AI' system for deployment in Ghana

Source: Absolute Data

The project revolved around the dockerization of the 'Navik - Marketing AI' application within a Python environment, specifically tailored for deployment by a telecommunications firm in Africa looking to expand its services in Ghanaian Market. This strategic move aimed at leveraging the power of containerization and automation to streamline the deployment process, making it more efficient and scalable. The implementation of the project had a significant impact on potential revenue growth in Ghana, with an estimated increase of 15%.

Source: Simplistic Workflow

During the course of the project, I created a test Python package to deepen my understanding of Docker and Git. This hands-on approach allowed for practical experimentation with containerization and version control, two essential aspects in modern software development. Furthermore, the project provided an opportunity to delve into the intricacies of business analytics and machine learning strategies used within the package. This encompassed gaining insights into methods for predicting Customer Lifetime Value (CLTV), exploring avenues for cross-selling and up-selling, and devising techniques for churn reduction and customer retention. The acquired knowledge not only enriched the technical skill set but also offered a comprehensive understanding of how data-driven strategies can be employed to drive tangible business results.

To know more about how AI/ML can be used to scale businesses visit ABSOLUTDATA
Credit Default Prediction Model: Identifying High-Risk Individuals Based on Personal Characteristics

Credit cards play a vital role as indispensable tools that enable individuals to make significant purchases that would otherwise be impossible. For our project, we focused on studying credit card late fees by using personal characteristics of individuals. To achieve this, we thoroughly evaluated our dataset using a range of models to identify the most effective approach for predicting the likelihood of late fees.

In a sequential manner, we initiated our data analysis by dividing the dataset into training and testing sets. Subsequently, we proceeded with data pre-processing steps. Firstly, we applied label encoding to convert categorical variables into numerical representations. For categories with a more intuitive meaning, we employed a one-to-one mapping approach to maintain their interpretability.

Following the encoding, we performed feature selection to enhance the efficiency of our model. We utilized the variance threshold method to identify and remove columns with variances above a predefined threshold. To further refine the feature set, we employed the SelectKBest algorithm to select the top 10 most predictive columns for our explanatory variable. All this helped in reducing the dimensionality of the dataset while focusing on the more influential features for our model.

We conducted our analysis using six different models to determine the most accurate one in predicting the likelihood of incurring late fees. Below I list the models we used and the accuracy score we received:
⚬ Logistic Regression : 0.6139
⚬ Gaussian Naive Bayes: 0.6253
⚬ XG Boost: 0.7014
⚬ Random Forest: 0.7087
⚬ K - Nearest neighbours (KNN): 0.714
⚬ KNN Hyperparameter Optimization: 0.7405

Through a comprehensive evaluation of the dataset using various models, we identified the KNN model with Manhattan distance and 9 neighbors as the optimal approach. However, there are several opportunities for further enhancement to improve the accuracy score. This can be achieved through implementing a Neural Network model, employing advanced feature engineering techniques, and utilizing a larger dataset. This valuable insight can assist credit card companies in making informed decisions regarding credit card approvals and managing risk effectively.
Drone mounted imaging system for vegetation health detection using NDVI

During the summer of 2022, I had the incredible opportunity to intern at the Edge of Space Academy, a pioneering institution in the field of spaceflight instrumentation and mission design at the University of Iowa. I was the Project Manager for the Ashton Prairie Near Infrared Sensing team, a project that shed light on the immense potential of drone technology in land and ecology management.

The Ashton Prairie project was an innovative initiative that aimed to leverage near-infrared sensing technology to monitor vegetation health during a heatwave, all accomplished under a budget of $1000. Our team constructed the imaging system's hardware from scratch, which encompassed two cameras, two Raspberry Pi units, a battery pack, and a 3D-printed container. Each component was meticulously integrated with the drone, ensuring optimal stability and safety during operation.

Source: NDVI

I created a python script for post-processing of images and extracted the relevant pixel values from the arial images taken in both RGB and IR. We used Normalized Difference Vegetation Index (NDVI) values to discern the health and stress levels of vegetation during the heatwave. The NDVI is a key indicator of live green vegetation, and our analysis provided valuable insights into how the vegetation responded to the heatwave conditions. I then created a false image to show vegetation stress based on the NDVI values. This data could be used to anticipate potential ecological changes and implement preventive measures in the future.

RGB

False Image - NDVI

Not only this we also provided a proof of concept for rendering a 3D model of vegetation height using structure from motion photogrammetric range imaging technique.

Rendering of 3D structure using Agisoft

The project underlined the importance of interdisciplinary collaboration in finding cost-effective solutions to pressing environmental issues. This also presents an affordable and viable alternative for small-scale farmers worldwide, enabling them to carry out land assessment and management independently, rather than depending on commercial entities.
Satellite-Based Fire Detection and Water and Land Cover Visualization Using Copernicus Data from ESA

For this project, I utilized Sentinel-2 satellite data obtained from the open-source Copernicus hub, courtesy of the European Space Agency. The Sentinel-2 satellite records images across 12 unique spectral bands, each revealing distinct, hidden features at varying wavelengths. This diverse range of data provides the flexibility to manipulate these layers in order to extract relevant information.

Source: Bands

I make use of a number of indexes to extract information about land cover, water cover and specially to detect fire. Below I list the Indexes used and the corresponding bands:

⚬ Normalized Differential Vegetation Index (NDVI) : B4 and B8
⚬ Normalized Difference Water Index (NDWI) : B3 and B8
⚬ Burn Area Index (BAI) : B11 and B8

I did try and use other indexes such as Normalized Difference Built-up index (NDBI) and Normalized Difference Snow Index (NDSI) to experiment and see what I might get. There appeared to be a few pixxels which did highlight the fire but not significant enough.

Below I present my findings and what each index helps in identifying and quantifying. I make of use of imagnery from Northern California, a region in general vicinity of Yosemite National Park and Lake Tahoe. For exact Coordinates view here

RBG image of the study region (B4, B3, B2)

Normalized Differential Vegetation Index (NDVI)
NDVI can be used to assess the dryness or health of the vegetation. Regions with decreasing NDVI values over time might be drying out, making them more susceptible to fires. In the below picture the dark orange regions represent lakes whereas the central light orange region shows smoke. The false color composite image in Red and NIR band shows the features as seen by naked eye.

NDVI (B8 - B4 / B8 + B4)

As seen by naked eye in the Red band (B4)

Normalized Differential Water Index (NDWI)
NDWI is useful for detecting and monitoring open water surfaces, such as lakes, rivers, reservoirs, and flood extents. It exploits the fact that open water absorbs more visible light and reflects more of the near-infrared spectrum. Green band and the Near-Infrared bands are used in this Index. Positive values of NDWI typically indicate water, whereas negative values point to non-water features.

NDWI (B3 - B8 / B3 + B8)

As seen by naked eye in the Green band (B3)

Burn Aread Index (BAI)
BAI is helpful in monitoring and quantifying the impact of wildfires on landscapes and ecosystems. BAI provides information about the extent and severity of burn scars resulting from fires. NIR is sensitive to healthy vegetation, while SWIR is sensitive to changes in vegetation and soil conditions, especially due to the presence of water and moisture. Higher Burn Area Index values correspond to more severe burn scars and a greater impact of the fire on the vegetation and soil of the affected area. We clearly see that this index allows us to pierce through the smoke and see what's actually happening.

BAI (B11 - B8 / B11 + B8)

As seen by naked eye when Red and Green are replaced by NIR and SWIR

Research

Throughout my academic journey at UCLA, I've been privileged to embark on an eclectic range of research projects and expeditions that have expanded the horizons of my understanding and experience. My pursuits have taken me to picturesque locales like Santa Barbara, where I collaborated with an esteemed team from NASA. Further, my intrinsic curiosity about the universe led me to delve into the profound question of extraterrestrial life, specifically focusing on the potential existence of life on Mars. Dive into the details of these adventures and more as you navigate through this page. Welcome to the chronicle of my scientific exploration.

Origins and Slope Variations: A study of Longmen Shan and Min Shan Mountain Systems of the Tibetan Plateau and Sichuan Basin

Min Shan

In a fulfilling learning opportunity with Abijah Simons, a PhD student in the EPSS department, I embarked on a profound journey exploring the Longmen Shan and Min Shan Mountain systems that elegantly stretch across the Tibetan Plateau and the Sichuan Basin. As part of her thesis, we employed visualizations to decipher the slope gradients present in both mountain systems.

Our first step involved harnessing the capabilities of QGIS. Through this, we meticulously crafted cross-sectional elevation profiles using Digital Elevation Models (DEM), revealing the hidden contours and elevations of both mountain systems.

DEM of Tibetan Planteu region

Utilizing python, we successfully extracted the point-values along the selected contours and formulated topographic swath profiles. The results were enlightening, as they vividly illustrated the gradient variations between the two ranges. This furthered the quest to understand how the origin and geological changes map to the evident slope variations.

Topographic profiles of the mountain systems

Leveraging the capabilities of ArcGIS, I created a digitized regional mineralogical map which will be made publicly available for further research .
Exploring the Possibility of Life on Mars: An In-depth review of Gale Crater

Read the full paper here

My research delves deep into the captivating geological history of Gale Crater on Mars, aiming to illuminate how this significant feature was formed and how its environment has evolved over time. I rely on the Mars Reconnaissance Orbiter (MRO), along with invaluable data from the Mars Science Laboratory (MSL) Curiosity Rover, to answer pivotal questions about the Martian landscape and its potential for sustaining life.

Curiosity Rover journey of Gale Crater

The formation of Gale Crater is a subject of particular interest, given its complex sedimentary structures. By carefully analyzing the sedimentary evidence captured by MRO, I offer insights into the initial conditions and processes that led to the creation of this intriguing Martian landmark.

A cornerstone of this review is examining the evidence for the historical presence of liquid water in Gale Crater. My work elaborates on previous studies that have identified fan-like deposits and small channels on the northwestern rim of the crater. These features suggest the past existence of flowing water, possibly even forming a lake, adding a new layer of complexity to our understanding of Martian history.

Topographic profile of Gale Crater

My analysis also extends to the role of hydrothermal activity in shaping the crater’s environment. I pay close attention to findings from the Curiosity Rover, particularly those concerning alteration halos in the Murray and Stimpson formations. These features have important implications for the mineralogical makeup of Gale Crater and contribute to our broader understanding of the region's environmental history.

One of the most groundbreaking aspects of previous researches focused on the organics discovered in Gale Crater by the Curiosity Rover. These organic materials could have significant implications for the potential habitability of the region, making these findings especially relevant for future Mars missions aimed at identifying signs of past or present life. The below figure illustrates different regions and the probablity associated with each as a potential site for release of methane .

Probability of methane release

Understanding the geological and climatic transformations of Gale Crater offers valuable contributions to the scientific community. It serves as a guidepost for future missions seeking to identify regions of Mars that have a high probability of containing bio-signatures. Consequently, this review aims to highlight the importance of different evidences pivotal in the ongoing quest to comprehend the Red Planet's capacity to sustain life, both in the past and potentially in the future.
Surface Biology and Geology High - Frequency Time Series (SHIFT) campaign to understand land and aquatic ecosystems

NASA JPL team

This research trip was a part of a project by NASA JPL in collaboration with the UCLA department of Ecology and Evolutionary Biology under Dr. Elsa Ordway's lab to link field measurements to hyperspectral remote sensing data in order to understand weekly changes in plants and trees such as their phenology. Field research was conducted in the Sedwig Reserve opereated by UC Santa Barbara.
The below image provides an example of what hyperspectral imagery produces for viewing phenology (not related to this research in any way, only an example of what hyperspectral imagery can help visualize)

credit: Continental-scale land surface phenology from harmonized Landsat 8 and Sentinel-2 imagery

Over the two days we collected samples in pre-dawn and mid-day to take different measurements including pre-dawn water potential, mid-day water potential, relative water potential and turgor loss point. I further assisted the JPL team and the research group in processing of samples which included weighing, cutting and storage in liquid nitrogen.

This opportunity gave me my first field trip experience and a chance to interact with NASA scientists. Moreover, I happen to be lucky as I got to see the milky way galaxy naked eye for the first time and a Falcon 9 launch from St. Andrew's Air force base.

Contact

If you have any questions, comments or suggestions regarding my work or would like to collaborate, or want to connect, please do get in touch. I’d love to hear from you!

Email

aashman0803@g.ucla.edu, aashman080303@gmail.com

Address

Department of Statistics and Data Science
University of California, Los Angeles
8125 Math Sciences Bldg. Box 951554
Los Angeles, CA 90095-1554, USA

UCLA Career Center
501 Westwood Plaza, 3rd Floor
Los Angeles, CA 90095-1573

/span>

Art

Wolf

Medium: Color Pencils

Leopard

Medium: Color Pencils

Tiger

Medium: Color Pencils

Lion

Medium: Lead Pencils

Wolf

Medium: Lead Pencils

Just a Bird

Medium: Lead Pencils

Misty Mountains

Medium: Lead Pencils

Elements

Text

This is bold and this is strong. This is italic and this is emphasized. This is ^superscript text and this is _subscript text. This is underlined and this is code: for (;;) { ... }. Finally, this is a link.

Heading Level 2

Heading Level 3

Heading Level 4

Heading Level 5

Heading Level 6

Blockquote

Fringilla nisl. Donec accumsan interdum nisi, quis tincidunt felis sagittis eget tempus euismod. Vestibulum ante ipsum primis in faucibus vestibulum. Blandit adipiscing eu felis iaculis volutpat ac adipiscing accumsan faucibus. Vestibulum ante ipsum primis in faucibus lorem ipsum dolor sit amet nullam adipiscing eu felis.

Preformatted

i = 0;

while (!deck.isInOrder()) {
    print 'Iteration ' + i;
    deck.shuffle();
    i++;
}

print 'It took ' + i + ' iterations to sort the deck.';

Lists

Unordered

Dolor pulvinar etiam.
Sagittis adipiscing.
Felis enim feugiat.

Alternate

Dolor pulvinar etiam.
Sagittis adipiscing.
Felis enim feugiat.

Ordered

Dolor pulvinar etiam.
Etiam vel felis viverra.
Felis enim feugiat.
Dolor pulvinar etiam.
Etiam vel felis lorem.
Felis enim et feugiat.

Icons

Actions

Table

Default

Name	Description	Price
Item One	Ante turpis integer aliquet porttitor.	29.99
Item Two	Vis ac commodo adipiscing arcu aliquet.	19.99
Item Three	Morbi faucibus arcu accumsan lorem.	29.99
Item Four	Vitae integer tempus condimentum.	19.99
Item Five	Ante turpis integer aliquet porttitor.	29.99
		100.00

Alternate

Name	Description	Price
Item One	Ante turpis integer aliquet porttitor.	29.99
Item Two	Vis ac commodo adipiscing arcu aliquet.	19.99
Item Three	Morbi faucibus arcu accumsan lorem.	29.99
Item Four	Vitae integer tempus condimentum.	19.99
Item Five	Ante turpis integer aliquet porttitor.	29.99
		100.00

Buttons

Icon
Icon

Disabled
Disabled

About Me

Skills

Programming Languages

AI Libraries

Large Language Models(LLM) and Techniques

Version Control

Softwares and Tools

Geographic Information System(GIS) tools

Projects

Current projects

Past projects

Leveraging Generative AI for code generation, explanation, conversion, debugging and optimization from natural language prompts

Leveraging Generative AI for Comprehensive DataFrame Analysis Based on Contextual Understanding of User Query

Dockerization of 'NAVIK-MARKETING-AI' system for deployment in Ghana

Credit Default Prediction Model: Identifying High-Risk Individuals Based on Personal Characteristics

Drone mounted imaging system for vegetation health detection using NDVI

Satellite-Based Fire Detection and Water and Land Cover Visualization Using Copernicus Data from ESA

Research

Origins and Slope Variations: A study of Longmen Shan and Min Shan Mountain Systems of the Tibetan Plateau and Sichuan Basin

Exploring the Possibility of Life on Mars: An In-depth review of Gale Crater

Surface Biology and Geology High - Frequency Time Series (SHIFT) campaign to understand land and aquatic ecosystems

Contact

Email

Address

Art

Wolf

Leopard

Tiger

Lion

Wolf

Just a Bird

Misty Mountains

Elements

Text

Heading Level 2

Heading Level 3

Heading Level 4

Heading Level 5

Heading Level 6

Blockquote

Preformatted

Lists

Unordered

Alternate

Ordered

Icons

Actions

Table

Default

Alternate

Buttons

Form