Convert IPYNB to TSV
IPYNB vs TSV Format Comparison
| Aspect | IPYNB (Source Format) | TSV (Target Format) |
|---|---|---|
| Format Overview |
IPYNB
Jupyter Notebook
IPYNB is an interactive computational document format used by Jupyter. It stores a sequence of cells containing code, markdown text, and outputs in a JSON-based structure. Jupyter Notebooks are the standard tool for data science, machine learning research, and scientific computing workflows.
Interactive Document, JSON-Based |
TSV
Tab-Separated Values
TSV is a simple tabular data format where columns are separated by tab characters and rows by newlines. It is widely used for data exchange between spreadsheets, databases, and data processing tools. TSV is preferred over CSV in many scientific contexts because tab characters rarely appear in data content, reducing the need for quoting.
Tabular Data, Tab Delimited |
| Technical Specifications |
Structure: JSON document with cells array
Encoding: UTF-8
Standard: Jupyter Notebook Format v4 (nbformat)
MIME Type: application/x-ipynb+json
Extension: .ipynb |
Structure: Tab-delimited rows with line breaks
Encoding: UTF-8 or other text encodings
Standard: IANA text/tab-separated-values
MIME Type: text/tab-separated-values
Extension: .tsv, .tab |
| Syntax Examples |
IPYNB uses JSON cell structure: {
"cell_type": "code",
"source": ["import pandas as pd\n",
"df = pd.read_csv('data.csv')"],
"outputs": [{"output_type": "stream",
"text": [" col1 col2\n"]}]
}
|
TSV uses tab characters to separate columns:
name	age	city
Alice	30	New York
Bob	25	London
Charlie	35	Tokyo
|
| Content Support |
|
|
| Advantages |
|
|
| Disadvantages |
|
|
| Common Uses |
|
|
| Best For |
|
|
| Version History |
Introduced: 2011 (IPython Notebook); Project Jupyter since 2014
Current Version: nbformat 4.5
Status: Active, widely adopted
Evolution: From IPython Notebook to Jupyter ecosystem |
Introduced: Early computing era (no formal date)
Current Version: IANA registered (text/tab-separated-values)
Status: Stable, universally supported
Evolution: Simple tabular format predating CSV standardization |
| Software Support |
Primary: JupyterLab, Jupyter Notebook, VS Code
Cloud: Google Colab, AWS SageMaker, Azure Notebooks
Libraries: nbformat, nbconvert, papermill
Other: GitHub rendering, Kaggle, Deepnote |
Spreadsheets: Excel, Google Sheets, LibreOffice Calc
Languages: Python (csv module), R, Perl, awk
Databases: MySQL LOAD DATA, PostgreSQL COPY
Other: Any text editor, command-line tools |
Why Convert IPYNB to TSV?
Converting IPYNB to TSV transforms notebook content into a flat tabular structure that can be opened in any spreadsheet application or processed by data tools. Each cell from the notebook becomes a row in the TSV file, with columns for the cell index, the cell type, and the cell's source content. This tabular representation makes it easy to filter, sort, and analyze notebook content using familiar spreadsheet operations.
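The cell-to-row mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the converter's actual implementation: the function names `escape_field` and `notebook_to_tsv` are hypothetical, and the escaping rule (newlines and tabs written as two-character `\n` and `\t` sequences) is an assumption based on the example outputs on this page.

```python
import json


def escape_field(text):
    # Normalize line endings, then write newlines and tabs as two-character
    # escapes so that each cell stays on one physical line (assumed rule).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", "\\n").replace("\t", "\\t")


def notebook_to_tsv(notebook_json):
    """Return TSV text with one row per notebook cell (hypothetical helper)."""
    notebook = json.loads(notebook_json)
    lines = ["index\tcell_type\tsource"]
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # nbformat stores "source" as either a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        fields = [str(index), cell.get("cell_type", ""), escape_field(source)]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Small made-up notebook used only for this demonstration.
sample = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "Some text."]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
    ]
})
print(notebook_to_tsv(sample), end="")
```

One limitation of this simple rule: code that already contains a literal backslash followed by `n` or `t` becomes indistinguishable from an escaped newline or tab. A reversible variant would escape backslashes first.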
TSV is often preferred over CSV in scientific computing because tab characters appear far less often than commas in source code and documentation text, which reduces the need for quoting. The output is also easy to process with command-line tools such as cut, sort, and awk: cut splits on tabs by default, while sort and awk accept a tab delimiter through their -t and -F options.
For teams managing large collections of notebooks, converting to TSV enables bulk analysis. You can concatenate TSV files from multiple notebooks and use spreadsheet pivot tables or database queries to analyze code patterns, identify common imports, or track documentation coverage across your entire notebook library.
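As a rough illustration of this kind of bulk analysis, the sketch below counts imported modules across code cells. The sample data, the `count_imports` helper, and the regular expression are all assumptions made for the example; it expects the index / cell_type / source layout with newlines escaped as `\n`.

```python
import collections
import csv
import io
import re

# Hypothetical combined TSV text in the index / cell_type / source layout.
combined_tsv = (
    "index\tcell_type\tsource\n"
    "0\tcode\timport pandas as pd\\nimport numpy as np\n"
    "1\tmarkdown\t# Notes\n"
    "2\tcode\tfrom pathlib import Path\\nimport pandas as pd\n"
)

# Matches "import x" and "from x import y"; only the first module on a line
# is captured, so "import a, b" counts "a" alone.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][\w.]*)")


def count_imports(tsv_text):
    """Count top-level modules imported in code cells (hypothetical helper)."""
    counts = collections.Counter()
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        if row["cell_type"] != "code":
            continue
        # Split on the escaped newline sequence, not on real line breaks.
        for line in row["source"].split("\\n"):
            match = IMPORT_RE.match(line)
            if match:
                counts[match.group(1).split(".")[0]] += 1
    return counts


print(count_imports(combined_tsv).most_common())
```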
Key Benefits of Converting IPYNB to TSV:
- Spreadsheet Ready: Open directly in Excel, Google Sheets, or LibreOffice Calc
- Fewer Quoting Issues: Tab delimiters avoid most of CSV's quoting complexities
- Command-Line Friendly: Process with cut, sort, awk, and other Unix tools
- Bulk Analysis: Concatenate and analyze multiple notebook contents
- Database Import: Load into databases using COPY or LOAD DATA commands
- Scientific Standard: Widely used format in bioinformatics and research
- Clipboard Compatible: Tab-delimited data pastes correctly into spreadsheets
Practical Examples
Example 1: Spreadsheet Data Export to TSV
Input IPYNB file (notebook.ipynb):
{
"cells": [
{
"cell_type": "markdown",
"source": ["# Sales Data Processing\n", "Cleaning and formatting quarterly sales figures."]
},
{
"cell_type": "code",
"source": ["import pandas as pd\n", "df = pd.read_csv('sales_q1.csv')\n", "df['total'] = df['quantity'] * df['price']\n", "print(df.head())"]
}
]
}
Output TSV file (notebook.tsv):
index cell_type source
0 markdown # Sales Data Processing\nCleaning and formatting quarterly sales figures.
1 code import pandas as pd\ndf = pd.read_csv('sales_q1.csv')\ndf['total'] = df['quantity'] * df['price']\nprint(df.head())
Example 2: Bioinformatics Data to TSV
Input IPYNB file (analysis.ipynb):
{
"cells": [
{
"cell_type": "markdown",
"source": ["## Gene Expression Analysis\n", "Processing RNA-seq count matrix from sequencing run."]
},
{
"cell_type": "code",
"source": ["import numpy as np\n", "gene_ids = ['BRCA1', 'TP53', 'EGFR']\n", "counts = np.array([1250, 890, 2340])\n", "normalized = counts / counts.sum()\n", "print(normalized)"]
},
{
"cell_type": "markdown",
"source": ["### Notes\n", "Normalization was performed using TPM method."]
}
]
}
Output TSV file (analysis.tsv):
index cell_type source
0 markdown ## Gene Expression Analysis\nProcessing RNA-seq count matrix from sequencing run.
1 code import numpy as np\ngene_ids = ['BRCA1', 'TP53', 'EGFR']\ncounts = np.array([1250, 890, 2340])\nnormalized = counts / counts.sum()\nprint(normalized)
2 markdown ### Notes\nNormalization was performed using TPM method.
Example 3: Database Export to TSV
Input IPYNB file (research.ipynb):
{
"cells": [
{
"cell_type": "markdown",
"source": ["# User Activity Report"]
},
{
"cell_type": "code",
"source": ["query = 'SELECT user_id, login_count FROM users'\n", "df = pd.read_sql(query, conn)\n", "df.to_csv('users_export.tsv', sep='\\t', index=False)"]
},
{
"cell_type": "code",
"source": ["print(f'Exported {len(df)} user records')\n", "print(f'Average logins: {df.login_count.mean():.1f}')"]
}
]
}
Output TSV file (research.tsv):
index cell_type source
0 markdown # User Activity Report
1 code query = 'SELECT user_id, login_count FROM users'\ndf = pd.read_sql(query, conn)\ndf.to_csv('users_export.tsv', sep='\t', index=False)
2 code print(f'Exported {len(df)} user records')\nprint(f'Average logins: {df.login_count.mean():.1f}')
Frequently Asked Questions (FAQ)
Q: How is notebook content organized in the TSV output?
A: Each notebook cell becomes a row with columns for cell index, cell type (code or markdown), and the cell source content. The header row labels these columns, and you can sort or filter by any column in a spreadsheet.
Q: How are multi-line code cells handled in a single TSV row?
A: Multi-line cell content is kept in a single row of the source column. In the examples above, internal newlines are escaped as the two-character sequence \n; depending on the formatting options, such fields may be quoted instead. Either way, each cell stays in one row, which keeps the tabular structure intact.
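To read a cell back as multi-line text, the escapes have to be reversed. The sketch below assumes the escaped form shown in the examples; `unescape_field` is a hypothetical helper, not part of any converter API.

```python
def unescape_field(text):
    # Turn the two-character sequences \n and \t back into real
    # newline and tab characters (assumed escaping rule).
    return text.replace("\\n", "\n").replace("\\t", "\t")


# One data row in the index / cell_type / source layout.
row = "1\tcode\timport pandas as pd\\ndf = pd.read_csv('data.csv')"

# Split on the first two tabs only, so the source field is kept whole.
index, cell_type, source = row.split("\t", 2)
print(unescape_field(source))
```

This simple reversal cannot tell an escaped tab from code that genuinely contains a backslash followed by `t`, such as the `sep='\t'` argument in Example 3, so such cells may not round-trip exactly.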
Q: Can I open the TSV file in Excel?
A: Yes, Excel supports TSV files natively. You can open the file directly or use the Text Import wizard to specify tab as the delimiter. Google Sheets and LibreOffice Calc also import TSV files seamlessly.
Q: Why choose TSV over CSV for this conversion?
A: Code content frequently contains commas, which would require extensive quoting in CSV format. Tab characters are much rarer in cell content (they appear mainly as indentation in some code styles), so TSV usually provides a cleaner output that is simpler to parse and less prone to formatting errors.
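The difference is easy to see by writing the same row with both delimiters using Python's standard csv module. The row contents and the `write_row` helper are made up for this illustration.

```python
import csv
import io

# A code cell whose source contains commas but no tabs.
row = [1, "code", "x = [1, 2, 3]"]


def write_row(fields, delimiter):
    """Serialize one row with the csv module's default (minimal) quoting."""
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerow(fields)
    return out.getvalue()


csv_line = write_row(row, ",")   # the source field has to be quoted
tsv_line = write_row(row, "\t")  # the source field is written as-is

print(csv_line, end="")
print(tsv_line, end="")
```

With the csv module's default settings, a field containing double quotes is still quoted under either delimiter, so TSV reduces quoting rather than removing it altogether.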
Q: Can I load the TSV into a database?
A: Yes, most major databases can import tab-delimited data. Use MySQL's LOAD DATA INFILE, PostgreSQL's COPY command, or SQLite's .import command (after setting .mode tabs) to bulk-load the TSV content into database tables.
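For a quick experiment without a database server, the same load can be sketched with Python's built-in sqlite3 module. The table name `cells`, the column name `cell_index`, and the sample rows are assumptions for this example.

```python
import csv
import io
import sqlite3

# Hypothetical TSV content in the index / cell_type / source layout.
tsv_text = (
    "index\tcell_type\tsource\n"
    "0\tmarkdown\t# User Activity Report\n"
    "1\tcode\tprint('done')\n"
)

conn = sqlite3.connect(":memory:")
# "index" is an SQL keyword, so the column is named cell_index here.
conn.execute(
    "CREATE TABLE cells (cell_index INTEGER, cell_type TEXT, source TEXT)"
)

reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
next(reader)  # skip the header row
conn.executemany("INSERT INTO cells VALUES (?, ?, ?)", reader)
conn.commit()

code_cells = conn.execute(
    "SELECT COUNT(*) FROM cells WHERE cell_type = 'code'"
).fetchone()[0]
print(code_cells)
```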
Q: Can I process the TSV with pandas?
A: Absolutely. Use pandas.read_csv('file.tsv', sep='\t') to load the TSV into a DataFrame. This lets you analyze notebook content using all of pandas' powerful data manipulation capabilities.
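A slightly fuller sketch is shown below, assuming the index / cell_type / source layout and `\n`-escaped newlines from the examples. Passing `quoting=csv.QUOTE_NONE` is a suggested precaution rather than a requirement of the format: it keeps double quotes inside code cells from being interpreted as field quoting.

```python
import csv
import io

import pandas as pd

# Hypothetical TSV content; the code cell contains double quotes.
tsv_text = (
    "index\tcell_type\tsource\n"
    "0\tmarkdown\t# Title\n"
    "1\tcode\tprint(\"hello\")\\nprint(\"world\")\n"
)

# Read fields literally, without treating double quotes as quoting.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", quoting=csv.QUOTE_NONE)

# Restore real line breaks from the escaped \n sequences.
df["source"] = df["source"].str.replace("\\n", "\n", regex=False)

code_cells = df[df["cell_type"] == "code"]
print(len(code_cells))
```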
Q: Are execution outputs included in the TSV?
A: The converter focuses on the source content of cells. Execution outputs are not included in the tabular output, as they are session-specific artifacts. Only the authored code and markdown content is extracted.
Q: Can I combine TSV files from multiple notebooks?
A: Yes, TSV files can be concatenated on the command line or merged in a spreadsheet. Plain cat repeats each file's header row, so skip the header of every file after the first (for example with tail -n +2). Adding a filename column helps identify which notebook each cell originated from when combining multiple files.
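The merge can also be sketched in Python. The `combine_tsv` helper and the column name `notebook` are hypothetical, and the sketch assumes every input shares the same header row.

```python
import io


def combine_tsv(named_texts):
    """Merge (name, tsv_text) pairs, adding a leading 'notebook' column."""
    out = io.StringIO()
    header_written = False
    for name, text in named_texts:
        # Split on "\n" only; escaped newlines inside fields are untouched.
        lines = [line for line in text.split("\n") if line]
        if not lines:
            continue
        if not header_written:
            # Keep the header from the first file and drop later copies.
            out.write("notebook\t" + lines[0] + "\n")
            header_written = True
        for line in lines[1:]:
            out.write(name + "\t" + line + "\n")
    return out.getvalue()


# Made-up contents of two converted notebooks.
a = "index\tcell_type\tsource\n0\tmarkdown\t# A\n"
b = "index\tcell_type\tsource\n0\tcode\tx = 1\n1\tcode\ty = 2\n"

combined = combine_tsv([("a.tsv", a), ("b.tsv", b)])
print(combined, end="")
```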