{
"cells": [
{
"cell_type": "markdown",
"id": "32605182",
"metadata": {},
"source": [
"---\n",
"title: Astrophysics Papers Daily Summaries Notebook with the Jetstream LLM Inference Service\n",
"date: 2025-05-08\n",
"categories:\n",
" - astrophysics\n",
" - llm\n",
" - jetstream2\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "0ede76e1",
"metadata": {},
"source": [
"# Astrophysics Papers: Daily Summaries Notebook\n",
"\n",
"Fetch today's `astro-ph` papers from arXiv, summarize abstracts with `llm`, and output a Markdown summary.\n",
"\n",
"This is an example of using the [Jetstream Inference Service](https://docs.jetstream-cloud.org/inference-service/overview/), notice that you need first to [configure the `llm` package to access the Jetstream Inference Service via the API](https://docs.jetstream-cloud.org/inference-service/api/)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5d18ec41",
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"#!pip install llm requests"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e4b93591",
"metadata": {},
"outputs": [],
"source": [
"import datetime\n",
"import requests\n",
"import xml.etree.ElementTree as ET\n",
"import llm\n",
"\n",
"# Get default LLM model (uses configured default, e.g. deepseek) citeturn0search0\n",
"model = llm.get_model()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "746a83f0",
"metadata": {},
"outputs": [],
"source": [
"# Define system prompt for concise summaries citeturn0search0\n",
"system_prompt = (\n",
" 'You are an expert summarization assistant. '\n",
" 'Provide a single concise sentence capturing the main result of an astrophysics abstract.'\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "893a7a13",
"metadata": {},
"outputs": [],
"source": [
"# Fetch today's astro-ph submissions from arXiv API\n",
"\n",
"# Download and parse today's astro-ph RSS feed from arXiv, take only the latest 10 papers\n",
"rss_url = \"https://rss.arxiv.org/rss/astro-ph\"\n",
"res = requests.get(rss_url)\n",
"root = ET.fromstring(res.content)\n",
"items = root.find('channel').findall('item')[:3]\n",
"\n",
"papers = []\n",
"for item in items:\n",
" title = item.findtext('title', default='')\n",
" link = item.findtext('link', default='')\n",
" desc = item.findtext('description', default='')\n",
" author = item.findtext('{http://purl.org/dc/elements/1.1/}creator', default='')\n",
" # Extract abstract from description (after 'Abstract:')\n",
" abstract = ''\n",
" if 'Abstract:' in desc:\n",
" abstract = desc.split('Abstract:', 1)[1].strip()\n",
" papers.append({\n",
" 'title': title,\n",
" 'link': link,\n",
" 'abstract': abstract,\n",
" 'author': author,\n",
" 'inst': ''\n",
" })"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "455bd1ba",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "013f30f79f974e51b5bda7bd69ebe617",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Summarizing papers: 0%| | 0/3 [00:00, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from tqdm.notebook import tqdm\n",
"import re\n",
"\n",
"today = datetime.date.today().strftime('%Y-%m-%d')\n",
"# Summarize abstracts and build Markdown content citeturn0search0\n",
"lines = [f'# Astrophysics Papers for {today}\\n']\n",
"for p in tqdm(papers, desc=\"Summarizing papers\"):\n",
" resp = model.prompt(p['abstract'], system=system_prompt)\n",
" # Remove any ... blocks from the response text\n",
" summary = re.sub(r'.*?', '', resp.text(), flags=re.DOTALL).strip()\n",
" lines.append(\n",
" f\"## {p['title']}\\n\"\n",
" f\"- **Author:** {p['author']}\\n\"\n",
" f\"- **Link:** {p['link']}\\n\"\n",
" f\"**Summary:** {summary}\\n\"\n",
" )\n",
"md = '\\n'.join(lines)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "e79a68a8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Saved summary to astro_ph_summaries_2025-05-08.md\n"
]
}
],
"source": [
"# Save Markdown summary\n",
"fn = f'astro_ph_summaries_{today}.md'\n",
"with open(fn, 'w') as f:\n",
" f.write(md)\n",
"print(f'Saved summary to {fn}')"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "6f83253c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# Astrophysics Papers for 2025-05-08\n",
"\n",
"## Machine Learning Workflow for Morphological Classification of Galaxies\n",
"- **Author:** Bernd Doser, Kai L. Polsterer, Andreas Fehlner, Sebastian Trujillo-Gomez\n",
"- **Institution:** \n",
"- **Link:** https://arxiv.org/abs/2505.04676\n",
"**Summary:** The study presents a reproducible, scalable machine-learning workflow leveraging open-source tools and FAIR principles to efficiently analyze exascale astrophysical simulations, enabling collaborative exploration of galaxy morphologies.\n",
"\n",
"## A data-driven approach for star formation parameterization using symbolic regression\n",
"- **Author:** Diane M. Salim, Matthew E. Orr, Blakesley Burkhart, Rachel S. Somerville, Miles Cramner\n",
"- **Institution:** \n",
"- **Link:** https://arxiv.org/abs/2505.04681\n",
"**Summary:** Machine learning-driven symbolic regression applied to FIRE-2 simulations reveals that star formation rate surface density at 100 Myr scales robustly with gas surface density, velocity dispersion, and stellar surface density, converging to physically interpretable Kennicutt-Schmidt-like relations that capture intrinsic scatter.\n",
"\n",
"## Reassessing the ZTF Volume-Limited Type Ia Supernova Sample and Its Implications for Continuous, Dust-Dependent Models of Intrinsic Scatter\n",
"- **Author:** Yukei S. Murakami, Daniel Scolnic\n",
"- **Institution:** \n",
"- **Link:** https://arxiv.org/abs/2505.04686\n",
"**Summary:** The re-analysis of the ZTF DR2 Type Ia supernovae sample using a continuous, color-dependent model (Host2D) supports the dust hypothesis for luminosity diversity, revealing methodological differences—not sample properties—as the source of prior conflicts, with a 4.0σ improvement over previous models.\n"
]
}
],
"source": [
"!cat $fn"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aad15baa",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}