Data Science Applications for Astronomy

Week 15: Course Reflections:

Putting the Pieces Together

Looking back

Course Overview

Students will build practical data science skills (e.g., querying astronomical databases, data storage and manipulation, data visualization, exploratory and explanatory data analysis, Bayesian modeling workflows, and reproducible research practices) and apply these lessons to analyzing data from astronomical surveys.

Goals

  • Increase their data acumen[data_acumen], and

  • Appreciate how building data science skills can benefit astronomy & astrophysics research.

[data_acumen] "We define data acumen as the ability to make good judgements about the use of data to support problem solutions." (Keller et al. 2020)

Objectives

  • Ingest and manipulate data from astronomical surveys.

  • Build, apply, assess and update astrophysically motivated models for astronomical observations.

  • Create visualizations for exploratory and explanatory data analyses of observations from astronomical surveys.

  • Synthesize the above into a dashboard to support the efficient analysis of astronomical observations.

  • Incorporate principles of reproducible research into their class project.

What Data Science skills have we developed?

  • Data Acumen

  • Databases, queries & storage

  • Ingesting data & Data wrangling

  • Exploratory data analysis

  • Model building & assessment

  • Explanatory data analysis

  • Data visualization

  • Reproducible research

  • Scientific workflows

  • Technical collaboration (if teamed up for project)

  • Scientific communications

What Data Science skills have we skipped (or only skimmed the surface of)?

  • Probability & Statistics

  • Machine Learning (ML)

    • Non-parametric regression

    • Classification

    • Clustering

    • Density estimation

    • Anomaly detection

    • Image analysis

  • Artificial Intelligence (AI)

    • Deep learning

  • Computing

    • Data structures

    • Algorithms

    • Databases

    • Parallel computing

  • Applications

    • Hardware

    • Big Data frameworks

    • ML/AI tools

    • Software engineering

    • Deployment & operations

Looking forward

Foundational Classes to learn more about Data Sciences

Mathematics

  • Linear Algebra (MATH 220)

  • Probability

    • Elementary Probability (STAT 318)

    • Probability Theory (STAT/MATH 414)

    • Introduction to Probability and Stochastic Processes for Engineering (STAT/MATH 418)

    • Astrostatistics (ASTRO 415)

Programming

  • Intro to Programming (e.g., CMPSC 121, 122)

  • Data management/databases (DS 220, but one DS or CMPSC prereq beyond CMPSC 122)

  • Programming Models for Big Data (DS/CMPSC 410, but several CMPSC prereqs)

  • Information Retrieval and Organization (e.g., IST 441, but several IST prereqs)

Machine Learning/AI

  • Machine Learning (DS 310; prereqs: (CMPSC 121 or CMPSC 131) and (STAT/MATH 318 or STAT/MATH 414 or STAT/MATH 418))

  • AI (e.g., DS/CMPSC 442, but several CMPSC prereqs)

Applied classes that connect to Data Sciences

  • Astrostatistics (ASTRO 415, Spring 2023)

  • Computational Astrophysics (ASTRO/PHYS 410, Spring 2023)

  • Astronomical Techniques? (ASTRO 451, Fall 2022)

  • Data Science Through Statistical Reasoning and Computation (STAT 380; but prereq STAT 184)

  • Visual Analytics for Data Sciences (DS 330; but prereq DS 220)

  • Research projects (e.g., ASTRO 496, summer project or thesis)

Project-based learning

Pros:

  • Helps motivate why you need to learn things

  • Emphasizes practical problems

Cons:

  • Forces you to work through implementation details

  • Risks learning specific tools rather than the underlying mathematics/algorithms

  • The specific tools used will very likely become obsolete soon

Setup

Built with Julia 1.11.5 and

DataFrames 1.7.0
HypertextLiteral 0.9.5
PlutoTeachingTools 0.3.1
PlutoUI 0.7.61

To run this tutorial locally, download this file and open it with Pluto.jl.