Data Science Applications for Astronomy
Week 15: Course Reflections:
Putting the Pieces Together
Looking back
Course Overview
Students will build practical data science skills (e.g., querying astronomical databases, data storage and manipulation, data visualization, exploratory and explanatory data analysis, Bayesian modeling workflows, and reproducible research practices) and apply these lessons to analyzing data from astronomical surveys.
Goals
Increase their data acumen (see note below), and
Appreciate how building data science skills can benefit astronomy & astrophysics research.
Note (data acumen):
"We define data acumen as the ability to make good judgements about the use of data to support problem solutions." (Keller et al. 2020)
Objectives
Ingest and manipulate data from astronomical surveys.
Build, apply, assess and update astrophysically motivated models for astronomical observations.
Create visualizations for exploratory and explanatory data analyses of observations from astronomical surveys.
Synthesize the above into a dashboard to support the efficient analysis of astronomical observations.
Incorporate principles of reproducible research into their class project.
What Data Science skills have we developed?
Data Acumen
Databases, queries & storage
Ingesting data & Data wrangling
Exploratory data analysis
Model building & assessment
Explanatory data analysis
Data visualization
Reproducible research
Scientific workflows
Technical collaboration (if teamed up for project)
Scientific communications
What Data Science skills have we skipped (or only skimmed the surface of)?
Probability & Statistics
Machine Learning (ML)
Non-parametric regression
Classification
Clustering
Density estimation
Anomaly detection
Image analysis
Artificial Intelligence (AI)
Deep learning
Computing
Data structures
Algorithms
Databases
Parallel computing
Applications
Hardware
Big Data frameworks
ML/AI tools
Software engineering
Deployment & operations
Looking forward
Foundational Classes to learn more about Data Sciences
Mathematics
Linear Algebra (MATH 220)
Probability
Elementary Probability (STAT 318)
Probability Theory (STAT/MATH 414)
Introduction to Probability and Stochastic Processes for Engineering (STAT/MATH 418)
Astrostatistics (ASTRO 415)
Programming
Intro to Programming (e.g., CMPSC 121, 122)
Data management/databases (DS 220, but requires one DS or CMPSC prereq beyond CMPSC 122)
Programming Models for Big Data (DS/CMPSC 410, but several CMPSC prereqs)
Information Retrieval and Organization (e.g., IST 441, but several IST prereqs)
Machine Learning/AI
Machine Learning (DS 310; prereqs: (CMPSC 121 or CMPSC 131) and (STAT/MATH 318 or STAT/MATH 414 or STAT/MATH 418))
AI (e.g., DS/CMPSC 442, but several CMPSC prereqs)
Applied classes that connect to Data Sciences
Astrostatistics (ASTRO 415, Spring 2023)
Computational Astrophysics (ASTRO/PHYS 410, Spring 2023)
Astronomical Techniques? (ASTRO 451, Fall 2022)
Data Science Through Statistical Reasoning and Computation (STAT 380; but prereq STAT 184)
Visual Analytics for Data Sciences (DS 330; but prereq DS 220)
Research projects (e.g., ASTRO 496, summer project or thesis)
Project-based learning
Pros:
Helps motivate why you need to learn things
Emphasizes practical problems
Cons:
Forces you to work through implementation details
Risks learning specific tools, rather than the underlying mathematics/algorithms
Specific tools used will very likely become obsolete soon
Setup
Built with Julia 1.11.5 and
DataFrames 1.7.0
HypertextLiteral 0.9.5
PlutoTeachingTools 0.3.1
PlutoUI 0.7.61
To run this tutorial locally, download this file and open it with Pluto.jl.