About Me

This is me!

Hi! Thank you for visiting my webpage.

I'm a junior at the University of California, Los Angeles (UCLA) and am majoring in Statistics and Data Science Engineering with a keen fascination towards Artificial Intelligence and Machine Learning. My journey has been a diverse one - my passion towards outer space rooted itself within me as a ninth-grader, igniting a profound desire to become a planetary scientist and inspiring a fervor that initially led me down the path of Geology at UCLA.

My academic path has been filled with inspirational moments, from getting to interact and take guidance from esteemed professors like Prof. David Piage (UCLA), working on groundbreaking projects like the Perseverance rover on Mars and Lunar Reconnaissance Orbiter . It was through this voyage that I was introduced to the mesmerizing world of remote sensing, marking the beginning of my endearing journey with data science.

I am fervently driven by a vision to harness Generative AI and Machine Learning and deeply passionate about their transformative potential in industries as diverse as finance, consulting, autonomous vehicles, agriculture, disaster management, energy, environment, and government. Be it finance or farming, my goal is simple: to use technology to solve problems and improve lives.

Outside of academics, I have a variety of interests and hobbies that enrich my life. Since the age of six, I've been immersed in the world of sketching and painting, experimenting with mediums from oil and acrylics to colored pencils. This artistic journey has been transformative, teaching me both attention to detail and patience. I've devoted a special section to showcase my artwork and hope it brings you as much joy as it has brought me. I also enjoy swimming, playing badminton and cricket, watching documentaries, movies and hanging out with my friends and having discussions on a wide range of topics.

I hope this website helps to gives you an idea of who I am and what I am passionate about. Below I list my skills to help you get a understanding of my technical abilities. Enjoy exploring this website!

Skills


    Programming Languages

  • C++
  • R
  • SQL
  • Python (Numpy, Pandas, Matplotlib, Seaborn, Plotly, Scikit-Learn)
  • Shell
  • HTML
  • CSS

  • AI Libraries

  • Langchain
  • Openai
  • Tensorflow
  • Huggingface
  • PandasAI

    Large Language Models(LLM) and Techniques

  • GPT-3.5-turbo
  • Llama-2
  • Prompt Engineering

    Version Control

  • Git
  • Github

    Softwares and Tools

  • Docker
  • Xcode
  • Visual Studios (VSCode)
  • R-studios
  • Microsoft Suit
  • G-suit
  • Notion
  • Airtable
  • Figma (UI/UX)

    Geographic Information System(GIS) tools

  • Qgis
  • ArcGIS
For more details, please see my CV or visit the 'Contact' section to know how we could get in touch.

Projects


My journey as an enthusiast has led me to explore a myriad of domains and industries, ranging from the awe-inspiring realms of space and earth science to the captivating world of business analytics. Here, you will find a diverse collection of my works, including my passion-fueled hobby projects, academic endeavors, and impactful internship projects. At the heart of my analytical pursuits lies Python, serving as the cornerstone of all my data-driven analyses. Within the realm of remote sensing, I harness the power of satellite imagery from sentinel-2 to detect and isolate wildfires, while also leveraging drone imagery to gain insights into vegetation stress. Venturing into the world of business analytics and predictive modeling, I rely on the prowess of scikit-learn to employ machine learning models, deftly addressing a wide array of challenges. I have currently delved into the intriguing realm of Generative AI. My focus lies in harnessing large language models to craft sophisticated systems tailored for a diverse array of use cases.

Below, I list my current and past projects. For an up-to-date record of my projects, please see my github page or CV.

    Current projects

  • Advancing Generative AI at Deloitte: A Comprehensive System for Code Generation, Explanation, Conversion, Debugging, and Optimization
  • Advancing Generative AI at Deloitte: A Comprehensive System for datafarme analysis and Visualizations using GPT and PandasAI


  • Past projects

  • Dockerized ML Package Deployment in Ghana: Empowering AI Solutions business growth
  • Credit Default Prediction Model: Identifying High-Risk Individuals Based on Personal Characteristic
  • Drone-Mounted IR Camera System for Vegetation Health Analysis: Enhancing Environmental Monitoring and Management
  • Satellite-Based Fire Detection and Water and Land Cover Visualization Using Copernicus Data from ESA
Below, I give an overview of my current and past projects that I have led or where I made substantial contributions.


  • Leveraging Generative AI for code generation, explanation, conversion, debugging and optimization from natural language prompts




    A video explaining how AI is being leveraged to bring about transformation in financial institutions and businesses. Credits: Deloitte AI Institute US



    This project has a noble goal of making coding accessible to everyone, including those without technical expertise. To achieve this, I utilize GPT-3.5 Turbo model, which I fine-tune using prompt engineering to tailor it for specific tasks.

    My approach involves a well-structured flow control system that ensures safety and efficiency. First, I implement a moderation check to maintain a friendly and appropriate environment. Then, I validate user input to ensure it aligns with the intended coding function. If the input is valid, the system proceeds to carry out the user's requested task. To maintain quality and accuracy, the model evaluates its own performance after each task. Additionally, I have taken care to include necessary error handling and try-except blocks to handle any unexpected situations that may arise.

    In addition to the features mentioned earlier, the project goes beyond simple task-based interactions. I have implemented a conversation capability with the model, allowing users to ask follow-up or recurrent questions related to previous tasks. This is made possible by enabling memory storage and context understanding in the system.



    A video by DeepLearning.AI on building systems using Chat Gpt


    By incorporating memory storage and context understanding, I aim to make the coding journey even more personalized and user-friendly, making it easier for non-technical individuals to grasp coding concepts and apply them effectively in their projects. This further enhances the democratization of coding and empowers users to engage in meaningful and productive coding conversations with the system.

  • Leveraging Generative AI for Comprehensive DataFrame Analysis Based on Contextual Understanding of User Query


    Moderation constituents

    Stock Image

    The primary objective of this project is to democratize advanced DataFrame analysis, making it accessible to both technical and non-technical users alike. By allowing users to upload datasets and pose questions in natural language, the system generates comprehensive, contextually relevant answers that encompass textual explanations, graphical representations, and data visualizations.

    From a technological perspective, the project is anchored on the integration of GPT and a specialized library called PandasAI, specifically designed for DataFrame analysis. Importantly, we have customized PandasAI's source code to align it with our unique use-cases. This customization extends to modifying prompts and editing, adding, or removing functions that process code snippets generated by the model for execution. Such fine-tuning grants the system the versatility and accuracy required to manage a diverse range of data types and user queries to provide only the neccesary output.

    One of the significant challenges in DataFrame analysis is the merging of data from disparate sources. This is where the project distinguishes itself. Through prompt engineering, the language model itself generates instructions on which DataFrames to select for the most relevant data, which is then formatted and used in subsequent merging processes. The system is enabled to understand user queries in context, subsequently identifying common columns across multiple DataFrames to act as keys for merging. This ensures that the answers generated are not just accurate but are based on a comprehensive view of all available data.

    Moderation constituents

    PandasAI

    Main salient feature of the project is its data visualization capabilities. Post data merging and analysis, the system utilizes HTML and JavaScript to produce interactive data plots, offering a superior user experience compared to standard Python plots. Furthermore, the system has undergone extensive testing to handle a range of user queries effectively, even those that may be partially coherent or unclear as well as bait questions aimed at confusing the model. This iterative process of testing, debugging, and prompt editing has yielded an impressive accuracy rate of 90%. Multiple checkpoints and output-formatting guidelines have been incorporated to reduce inconsistencies and enhance the quality of the output.

    Addressing the common issue of missing data in data analysis, the system provides users with two primary options: interpolation and filling of null values. These methods are tailored to the data type of each column. For numerical columns, the system employs interpolation techniques that minimize the Residual Sum of Squares (RSS), and fill null values fills rows with the mean values ensuring more accurate data representation.

    In summary, this project aims not merely to simplify DataFrame analysis but to fundamentally transform it into a more intuitive and user-friendly experience. Through leveraging advanced AI technologies and extensive customization, it significantly reduces the expertise and time needed for comprehensive data analysis, thereby serving the larger goal of making data science accessible to all..

  • Dockerization of 'NAVIK-MARKETING-AI' system for deployment in Ghana

    Moderation constituents

    Source: Absolute Data

    The project revolved around the dockerization of the 'Navik - Marketing AI' application within a Python environment, specifically tailored for deployment by a telecommunications firm in Africa looking to expand its services in Ghanaian Market. This strategic move aimed at leveraging the power of containerization and automation to streamline the deployment process, making it more efficient and scalable. The implementation of the project had a significant impact on potential revenue growth in Ghana, with an estimated increase of 15%.

    Simple workflow

    Source: Simplistic Workflow

    During the course of the project, I created a test Python package to deepen my understanding of Docker and Git. This hands-on approach allowed for practical experimentation with containerization and version control, two essential aspects in modern software development. Furthermore, the project provided an opportunity to delve into the intricacies of business analytics and machine learning strategies used within the package. This encompassed gaining insights into methods for predicting Customer Lifetime Value (CLTV), exploring avenues for cross-selling and up-selling, and devising techniques for churn reduction and customer retention. The acquired knowledge not only enriched the technical skill set but also offered a comprehensive understanding of how data-driven strategies can be employed to drive tangible business results.

    To know more about how AI/ML can be used to scale businesses visit ABSOLUTDATA

  • Credit Default Prediction Model: Identifying High-Risk Individuals Based on Personal Characteristics


    Credit cards play a vital role as indispensable tools that enable individuals to make significant purchases that would otherwise be impossible. For our project, we focused on studying credit card late fees by using personal characteristics of individuals. To achieve this, we thoroughly evaluated our dataset using a range of models to identify the most effective approach for predicting the likelihood of late fees.

    In a sequential manner, we initiated our data analysis by dividing the dataset into training and testing sets. Subsequently, we proceeded with data pre-processing steps. Firstly, we applied label encoding to convert categorical variables into numerical representations. For categories with a more intuitive meaning, we employed a one-to-one mapping approach to maintain their interpretability.

    Following the encoding, we performed feature selection to enhance the efficiency of our model. We utilized the variance threshold method to identify and remove columns with variances above a predefined threshold. To further refine the feature set, we employed the SelectKBest algorithm to select the top 10 most predictive columns for our explanatory variable. All this helped in reducing the dimensionality of the dataset while focusing on the more influential features for our model.


    We conducted our analysis using six different models to determine the most accurate one in predicting the likelihood of incurring late fees. Below I list the models we used and the accuracy score we received:

    Logistic Regression : 0.6139
    Gaussian Naive Bayes: 0.6253
    XG Boost: 0.7014
    Random Forest: 0.7087
    K - Nearest neighbours (KNN): 0.714
    KNN Hyperparameter Optimization: 0.7405

    Through a comprehensive evaluation of the dataset using various models, we identified the KNN model with Manhattan distance and 9 neighbors as the optimal approach. However, there are several opportunities for further enhancement to improve the accuracy score. This can be achieved through implementing a Neural Network model, employing advanced feature engineering techniques, and utilizing a larger dataset. This valuable insight can assist credit card companies in making informed decisions regarding credit card approvals and managing risk effectively.

  • Drone mounted imaging system for vegetation health detection using NDVI
    Moderation constituents

    During the summer of 2022, I had the incredible opportunity to intern at the Edge of Space Academy, a pioneering institution in the field of spaceflight instrumentation and mission design at the University of Iowa. I was the Project Manager for the Ashton Prairie Near Infrared Sensing team, a project that shed light on the immense potential of drone technology in land and ecology management.

    The Ashton Prairie project was an innovative initiative that aimed to leverage near-infrared sensing technology to monitor vegetation health during a heatwave, all accomplished under a budget of $1000. Our team constructed the imaging system's hardware from scratch, which encompassed two cameras, two Raspberry Pi units, a battery pack, and a 3D-printed container. Each component was meticulously integrated with the drone, ensuring optimal stability and safety during operation.

    Moderation constituents

    Source: NDVI

    I created a python script for post-processing of images and extracted the relevant pixel values from the arial images taken in both RGB and IR. We used Normalized Difference Vegetation Index (NDVI) values to discern the health and stress levels of vegetation during the heatwave. The NDVI is a key indicator of live green vegetation, and our analysis provided valuable insights into how the vegetation responded to the heatwave conditions. I then created a false image to show vegetation stress based on the NDVI values. This data could be used to anticipate potential ecological changes and implement preventive measures in the future.

    RGB
    False Image - NDVI

    Not only this we also provided a proof of concept for rendering a 3D model of vegetation height using structure from motion photogrammetric range imaging technique.


    Rendering of 3D structure using Agisoft

    The project underlined the importance of interdisciplinary collaboration in finding cost-effective solutions to pressing environmental issues. This also presents an affordable and viable alternative for small-scale farmers worldwide, enabling them to carry out land assessment and management independently, rather than depending on commercial entities.

  • Satellite-Based Fire Detection and Water and Land Cover Visualization Using Copernicus Data from ESA
    Moderation constituents

    For this project, I utilized Sentinel-2 satellite data obtained from the open-source Copernicus hub, courtesy of the European Space Agency. The Sentinel-2 satellite records images across 12 unique spectral bands, each revealing distinct, hidden features at varying wavelengths. This diverse range of data provides the flexibility to manipulate these layers in order to extract relevant information.

    Moderation constituents

    Source: Bands

    I make use of a number of indexes to extract information about land cover, water cover and specially to detect fire. Below I list the Indexes used and the corresponding bands:

    Normalized Differential Vegetation Index (NDVI) : B4 and B8
    Normalized Difference Water Index (NDWI) : B3 and B8
    Burn Area Index (BAI) : B11 and B8

    I did try and use other indexes such as Normalized Difference Built-up index (NDBI) and Normalized Difference Snow Index (NDSI) to experiment and see what I might get. There appeared to be a few pixxels which did highlight the fire but not significant enough.

    Below I present my findings and what each index helps in identifying and quantifying. I make of use of imagnery from Northern California, a region in general vicinity of Yosemite National Park and Lake Tahoe. For exact Coordinates view here

    Moderation constituents

    RBG image of the study region (B4, B3, B2)

    Normalized Differential Vegetation Index (NDVI)
    NDVI can be used to assess the dryness or health of the vegetation. Regions with decreasing NDVI values over time might be drying out, making them more susceptible to fires. In the below picture the dark orange regions represent lakes whereas the central light orange region shows smoke. The false color composite image in Red and NIR band shows the features as seen by naked eye.

    Moderation constituents

    NDVI (B8 - B4 / B8 + B4)
    Moderation constituents

    As seen by naked eye in the Red band (B4)

    Normalized Differential Water Index (NDWI)
    NDWI is useful for detecting and monitoring open water surfaces, such as lakes, rivers, reservoirs, and flood extents. It exploits the fact that open water absorbs more visible light and reflects more of the near-infrared spectrum. Green band and the Near-Infrared bands are used in this Index. Positive values of NDWI typically indicate water, whereas negative values point to non-water features.

    Moderation constituents

    NDWI (B3 - B8 / B3 + B8)
    Moderation constituents

    As seen by naked eye in the Green band (B3)

    Burn Aread Index (BAI)
    BAI is helpful in monitoring and quantifying the impact of wildfires on landscapes and ecosystems. BAI provides information about the extent and severity of burn scars resulting from fires. NIR is sensitive to healthy vegetation, while SWIR is sensitive to changes in vegetation and soil conditions, especially due to the presence of water and moisture. Higher Burn Area Index values correspond to more severe burn scars and a greater impact of the fire on the vegetation and soil of the affected area. We clearly see that this index allows us to pierce through the smoke and see what's actually happening.

    Moderation constituents

    BAI (B11 - B8 / B11 + B8)
    Moderation constituents

    As seen by naked eye when Red and Green are replaced by NIR and SWIR

Research

Moderation constituents

Throughout my academic journey at UCLA, I've been privileged to embark on an eclectic range of research projects and expeditions that have expanded the horizons of my understanding and experience. My pursuits have taken me to picturesque locales like Santa Barbara, where I collaborated with an esteemed team from NASA. Further, my intrinsic curiosity about the universe led me to delve into the profound question of extraterrestrial life, specifically focusing on the potential existence of life on Mars. Dive into the details of these adventures and more as you navigate through this page. Welcome to the chronicle of my scientific exploration.


  • Origins and Slope Variations: A study of Longmen Shan and Min Shan Mountain Systems of the Tibetan Plateau and Sichuan Basin

    Moderation constituents

    Min Shan

    In a fulfilling learning opportunity with Abijah Simons, a PhD student in the EPSS department, I embarked on a profound journey exploring the Longmen Shan and Min Shan Mountain systems that elegantly stretch across the Tibetan Plateau and the Sichuan Basin. As part of her thesis, we employed visualizations to decipher the slope gradients present in both mountain systems.

    Our first step involved harnessing the capabilities of QGIS. Through this, we meticulously crafted cross-sectional elevation profiles using Digital Elevation Models (DEM), revealing the hidden contours and elevations of both mountain systems.

    Moderation constituents

    DEM of Tibetan Planteu region

    Utilizing python, we successfully extracted the point-values along the selected contours and formulated topographic swath profiles. The results were enlightening, as they vividly illustrated the gradient variations between the two ranges. This furthered the quest to understand how the origin and geological changes map to the evident slope variations.

    Moderation constituents

    Topographic profiles of the mountain systems

    Leveraging the capabilities of ArcGIS, I created a digitized regional mineralogical map which will be made publicly available for further research .


  • Exploring the Possibility of Life on Mars: An In-depth review of Gale Crater

    Moderation constituents

    Read the full paper here

    My research delves deep into the captivating geological history of Gale Crater on Mars, aiming to illuminate how this significant feature was formed and how its environment has evolved over time. I rely on the Mars Reconnaissance Orbiter (MRO), along with invaluable data from the Mars Science Laboratory (MSL) Curiosity Rover, to answer pivotal questions about the Martian landscape and its potential for sustaining life.

    Moderation constituents

    Curiosity Rover journey of Gale Crater

    The formation of Gale Crater is a subject of particular interest, given its complex sedimentary structures. By carefully analyzing the sedimentary evidence captured by MRO, I offer insights into the initial conditions and processes that led to the creation of this intriguing Martian landmark.

    A cornerstone of this review is examining the evidence for the historical presence of liquid water in Gale Crater. My work elaborates on previous studies that have identified fan-like deposits and small channels on the northwestern rim of the crater. These features suggest the past existence of flowing water, possibly even forming a lake, adding a new layer of complexity to our understanding of Martian history.

    Moderation constituents

    Topographic profile of Gale Crater

    My analysis also extends to the role of hydrothermal activity in shaping the crater’s environment. I pay close attention to findings from the Curiosity Rover, particularly those concerning alteration halos in the Murray and Stimpson formations. These features have important implications for the mineralogical makeup of Gale Crater and contribute to our broader understanding of the region's environmental history.

    One of the most groundbreaking aspects of previous researches focused on the organics discovered in Gale Crater by the Curiosity Rover. These organic materials could have significant implications for the potential habitability of the region, making these findings especially relevant for future Mars missions aimed at identifying signs of past or present life. The below figure illustrates different regions and the probablity associated with each as a potential site for release of methane .

    Moderation constituents

    Probability of methane release

    Understanding the geological and climatic transformations of Gale Crater offers valuable contributions to the scientific community. It serves as a guidepost for future missions seeking to identify regions of Mars that have a high probability of containing bio-signatures. Consequently, this review aims to highlight the importance of different evidences pivotal in the ongoing quest to comprehend the Red Planet's capacity to sustain life, both in the past and potentially in the future.

  • Surface Biology and Geology High - Frequency Time Series (SHIFT) campaign to understand land and aquatic ecosystems

    Moderation constituents

    NASA JPL team

    This research trip was a part of a project by NASA JPL in collaboration with the UCLA department of Ecology and Evolutionary Biology under Dr. Elsa Ordway's lab to link field measurements to hyperspectral remote sensing data in order to understand weekly changes in plants and trees such as their phenology. Field research was conducted in the Sedwig Reserve opereated by UC Santa Barbara.
    The below image provides an example of what hyperspectral imagery produces for viewing phenology (not related to this research in any way, only an example of what hyperspectral imagery can help visualize)

    Moderation constituents

    credit: Continental-scale land surface phenology from harmonized Landsat 8 and Sentinel-2 imagery

    Over the two days we collected samples in pre-dawn and mid-day to take different measurements including pre-dawn water potential, mid-day water potential, relative water potential and turgor loss point. I further assisted the JPL team and the research group in processing of samples which included weighing, cutting and storage in liquid nitrogen.

    This opportunity gave me my first field trip experience and a chance to interact with NASA scientists. Moreover, I happen to be lucky as I got to see the milky way galaxy naked eye for the first time and a Falcon 9 launch from St. Andrew's Air force base.

Contact

Moderation constituents

If you have any questions, comments or suggestions regarding my work or would like to collaborate, or want to connect, please do get in touch. I’d love to hear from you!

Email

aashman0803@g.ucla.edu, aashman080303@gmail.com

Address

Department of Statistics and Data Science
University of California, Los Angeles
8125 Math Sciences Bldg. Box 951554
Los Angeles, CA 90095-1554, USA

UCLA Career Center
501 Westwood Plaza, 3rd Floor
Los Angeles, CA 90095-1573

Art


Art 1

Wolf

Medium: Color Pencils

Art 2

Leopard

Medium: Color Pencils

Art 3

Tiger

Medium: Color Pencils

Art 4

Lion

Medium: Lead Pencils

Art 5

Wolf

Medium: Lead Pencils

Art 3

Just a Bird

Medium: Lead Pencils

Art 3

Misty Mountains

Medium: Lead Pencils

Elements

Text

This is bold and this is strong. This is italic and this is emphasized. This is superscript text and this is subscript text. This is underlined and this is code: for (;;) { ... }. Finally, this is a link.


Heading Level 2

Heading Level 3

Heading Level 4

Heading Level 5
Heading Level 6

Blockquote

Fringilla nisl. Donec accumsan interdum nisi, quis tincidunt felis sagittis eget tempus euismod. Vestibulum ante ipsum primis in faucibus vestibulum. Blandit adipiscing eu felis iaculis volutpat ac adipiscing accumsan faucibus. Vestibulum ante ipsum primis in faucibus lorem ipsum dolor sit amet nullam adipiscing eu felis.

Preformatted

i = 0;

while (!deck.isInOrder()) {
    print 'Iteration ' + i;
    deck.shuffle();
    i++;
}

print 'It took ' + i + ' iterations to sort the deck.';

Lists

Unordered

  • Dolor pulvinar etiam.
  • Sagittis adipiscing.
  • Felis enim feugiat.

Alternate

  • Dolor pulvinar etiam.
  • Sagittis adipiscing.
  • Felis enim feugiat.

Ordered

  1. Dolor pulvinar etiam.
  2. Etiam vel felis viverra.
  3. Felis enim feugiat.
  4. Dolor pulvinar etiam.
  5. Etiam vel felis lorem.
  6. Felis enim et feugiat.

Icons

Actions

Table

Default

Name Description Price
Item One Ante turpis integer aliquet porttitor. 29.99
Item Two Vis ac commodo adipiscing arcu aliquet. 19.99
Item Three Morbi faucibus arcu accumsan lorem. 29.99
Item Four Vitae integer tempus condimentum. 19.99
Item Five Ante turpis integer aliquet porttitor. 29.99
100.00

Alternate

Name Description Price
Item One Ante turpis integer aliquet porttitor. 29.99
Item Two Vis ac commodo adipiscing arcu aliquet. 19.99
Item Three Morbi faucibus arcu accumsan lorem. 29.99
Item Four Vitae integer tempus condimentum. 19.99
Item Five Ante turpis integer aliquet porttitor. 29.99
100.00

Buttons

  • Disabled
  • Disabled

Form