Convert IPYNB to TSV

Drag and drop files here or click to select.
Maximum file size: 100 MB.

IPYNB vs TSV Format Comparison

Aspect IPYNB (Source Format) TSV (Target Format)
Format Overview
IPYNB
Jupyter Notebook

IPYNB is an interactive computational document format used by Jupyter. It stores a sequence of cells containing code, markdown text, and outputs in a JSON-based structure. Jupyter Notebooks are the standard tool for data science, machine learning research, and scientific computing workflows.

Interactive Document JSON-Based
TSV
Tab-Separated Values

TSV is a simple tabular data format where columns are separated by tab characters and rows by newlines. It is widely used for data exchange between spreadsheets, databases, and data processing tools. TSV is preferred over CSV in many scientific contexts because tab characters rarely appear in data content, reducing the need for quoting.

Tabular Data Tab Delimited
Technical Specifications
Structure: JSON document with cells array
Encoding: UTF-8
Standard: Jupyter Notebook Format v4 (nbformat)
MIME Type: application/x-ipynb+json
Extension: .ipynb
Structure: Tab-delimited rows with line breaks
Encoding: UTF-8 or other text encodings
Standard: IANA text/tab-separated-values
MIME Type: text/tab-separated-values
Extension: .tsv, .tab
Syntax Examples

IPYNB uses JSON cell structure:

{
  "cell_type": "code",
  "source": ["import pandas as pd\n",
             "df = pd.read_csv('data.csv')"],
  "outputs": [{"output_type": "stream",
               "text": ["   col1  col2\n"]}]
}

TSV uses tab characters to separate columns:

name	age	city
Alice	30	New York
Bob	25	London
Charlie	35	Tokyo
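
Tab-delimited rows like the ones above can be written and read with Python's standard csv module by setting the delimiter to a tab; a minimal sketch (the sample data mirrors the rows shown):

```python
import csv
import io

rows = [
    ["name", "age", "city"],
    ["Alice", "30", "New York"],
    ["Bob", "25", "London"],
    ["Charlie", "35", "Tokyo"],
]

# Write TSV: csv.writer joins fields with the tab delimiter
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
tsv_text = buf.getvalue()

# Read it back into lists of fields
parsed = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(parsed[1])  # ['Alice', '30', 'New York']
```

The same delimiter="\t" argument works for reading TSV files from disk with csv.reader.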
Content Support
  • Python, R, Julia, and other language code cells
  • Markdown text with rich formatting
  • Code execution outputs and results
  • Inline images and visualizations
  • LaTeX mathematical expressions
  • Cell metadata and tags
  • Kernel information and state
  • Flat tabular data with rows and columns
  • Header row for column names
  • Text values without quoting requirements
  • Numeric and string data fields
  • No nested structures or hierarchies
  • No metadata or typing information
  • Simple and unambiguous parsing
Advantages
  • Interactive code execution with immediate output
  • Combines documentation with executable code
  • Rich visualization and plotting support
  • Supports multiple programming languages
  • Industry standard for data science workflows
  • Plain-text JSON structure, inspectable in any editor
  • Simpler than CSV (no quoting issues)
  • Opens directly in spreadsheet applications
  • Easy to parse with any programming language
  • Preferred in bioinformatics and genomics
  • Copy-paste friendly with spreadsheets
  • Minimal overhead for tabular data
Disadvantages
  • Requires Jupyter environment to execute
  • Large file sizes with embedded outputs
  • Difficult to diff in version control
  • Non-linear execution can cause confusion
  • Hidden state between cell executions
  • Cannot represent nested or hierarchical data
  • No standard for escaping tab characters in data
  • No data type information (everything is text)
  • Not suitable for complex documents
  • Large datasets produce very large files
Common Uses
  • Data exploration and analysis
  • Machine learning model development
  • Scientific research documentation
  • Educational tutorials and coursework
  • Reproducible research papers
  • Bioinformatics data exchange
  • Spreadsheet data import/export
  • Database bulk loading operations
  • Scientific instrument data output
  • Clipboard data transfer between applications
Best For
  • Data science and machine learning workflows
  • Interactive code exploration and prototyping
  • Reproducible research and analysis
  • Educational tutorials and demonstrations
  • Bioinformatics and scientific data exchange
  • Spreadsheet data import and clipboard pasting
  • Database bulk loading with tab delimiters
  • Command-line data processing with cut and awk
Version History
Introduced: 2014 (Project Jupyter)
Current Version: nbformat 4.5
Status: Active, widely adopted
Evolution: From IPython Notebook to Jupyter ecosystem
Introduced: Early computing era (no formal date)
Current Version: IANA registered (text/tab-separated-values)
Status: Stable, universally supported
Evolution: Simple tabular format predating CSV standardization
Software Support
Primary: JupyterLab, Jupyter Notebook, VS Code
Cloud: Google Colab, AWS SageMaker, Azure Notebooks
Libraries: nbformat, nbconvert, papermill
Other: GitHub rendering, Kaggle, Deepnote
Spreadsheets: Excel, Google Sheets, LibreOffice Calc
Languages: Python (csv module), R, Perl, awk
Databases: MySQL LOAD DATA, PostgreSQL COPY
Other: Any text editor, command-line tools

Why Convert IPYNB to TSV?

Converting IPYNB to TSV transforms notebook content into a flat tabular structure that can be opened in any spreadsheet application or processed by data tools. Each cell from the notebook becomes a row in the TSV file, with columns for cell type, index, and content. This tabular representation makes it easy to filter, sort, and analyze notebook content using familiar spreadsheet operations.

TSV is preferred over CSV in scientific computing because tab characters rarely appear in source code or documentation text, eliminating quoting complexity. This makes the output cleaner and easier to process with command-line tools like cut, sort, and awk that natively understand tab-delimited fields.

For teams managing large collections of notebooks, converting to TSV enables bulk analysis. You can concatenate TSV files from multiple notebooks and use spreadsheet pivot tables or database queries to analyze code patterns, identify common imports, or track documentation coverage across your entire notebook library.

Key Benefits of Converting IPYNB to TSV:

  • Spreadsheet Ready: Open directly in Excel, Google Sheets, or LibreOffice Calc
  • No Quoting Issues: Tab delimiters avoid CSV's quoting complexities
  • Command-Line Friendly: Process with cut, sort, awk, and other Unix tools
  • Bulk Analysis: Concatenate and analyze multiple notebook contents
  • Database Import: Load into databases using COPY or LOAD DATA commands
  • Scientific Standard: Preferred format in bioinformatics and research
  • Clipboard Compatible: Tab-delimited data pastes correctly into spreadsheets
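
The cell-to-row mapping described above can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not this converter's exact implementation: the function name and the choice to escape backslashes, tabs, and newlines as literal sequences are assumptions made for the example.

```python
import json

def ipynb_to_tsv(notebook_json: str) -> str:
    """Flatten notebook cells into TSV rows: index, cell_type, source.

    Sketch only; real converters may differ in escaping details.
    """
    nb = json.loads(notebook_json)
    lines = ["index\tcell_type\tsource"]
    for i, cell in enumerate(nb.get("cells", [])):
        source = "".join(cell.get("source", []))
        # Escape characters that would break the tabular structure
        escaped = (source.replace("\\", "\\\\")
                         .replace("\t", "\\t")
                         .replace("\n", "\\n"))
        lines.append(f"{i}\t{cell['cell_type']}\t{escaped}")
    return "\n".join(lines)

notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown",
         "source": ["# Sales Data Processing\n", "Quarterly figures."]},
        {"cell_type": "code",
         "source": ["import pandas as pd\n", "df = pd.read_csv('sales_q1.csv')"]},
    ]
})
print(ipynb_to_tsv(notebook))
```

Each cell lands on exactly one output row, so the result opens cleanly in a spreadsheet with one row per notebook cell.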

Practical Examples

Example 1: Spreadsheet Data Export to TSV

Input IPYNB file (notebook.ipynb):

{
  "cells": [
    {
      "cell_type": "markdown",
      "source": ["# Sales Data Processing\n", "Cleaning and formatting quarterly sales figures."]
    },
    {
      "cell_type": "code",
      "source": ["import pandas as pd\n", "df = pd.read_csv('sales_q1.csv')\n", "df['total'] = df['quantity'] * df['price']\n", "print(df.head())"]
    }
  ]
}

Output TSV file (notebook.tsv):

index	cell_type	source
0	markdown	# Sales Data Processing\nCleaning and formatting quarterly sales figures.
1	code	import pandas as pd\ndf = pd.read_csv('sales_q1.csv')\ndf['total'] = df['quantity'] * df['price']\nprint(df.head())

Example 2: Bioinformatics Data to TSV

Input IPYNB file (analysis.ipynb):

{
  "cells": [
    {
      "cell_type": "markdown",
      "source": ["## Gene Expression Analysis\n", "Processing RNA-seq count matrix from sequencing run."]
    },
    {
      "cell_type": "code",
      "source": ["import numpy as np\n", "gene_ids = ['BRCA1', 'TP53', 'EGFR']\n", "counts = np.array([1250, 890, 2340])\n", "normalized = counts / counts.sum()\n", "print(normalized)"]
    },
    {
      "cell_type": "markdown",
      "source": ["### Notes\n", "Normalization was performed using TPM method."]
    }
  ]
}

Output TSV file (analysis.tsv):

index	cell_type	source
0	markdown	## Gene Expression Analysis\nProcessing RNA-seq count matrix from sequencing run.
1	code	import numpy as np\ngene_ids = ['BRCA1', 'TP53', 'EGFR']\ncounts = np.array([1250, 890, 2340])\nnormalized = counts / counts.sum()\nprint(normalized)
2	markdown	### Notes\nNormalization was performed using TPM method.

Example 3: Database Export to TSV

Input IPYNB file (research.ipynb):

{
  "cells": [
    {
      "cell_type": "markdown",
      "source": ["# User Activity Report"]
    },
    {
      "cell_type": "code",
      "source": ["query = 'SELECT user_id, login_count FROM users'\n", "df = pd.read_sql(query, conn)\n", "df.to_csv('users_export.tsv', sep='\\t', index=False)"]
    },
    {
      "cell_type": "code",
      "source": ["print(f'Exported {len(df)} user records')\n", "print(f'Average logins: {df.login_count.mean():.1f}')"]
    }
  ]
}

Output TSV file (research.tsv):

index	cell_type	source
0	markdown	# User Activity Report
1	code	query = 'SELECT user_id, login_count FROM users'\ndf = pd.read_sql(query, conn)\ndf.to_csv('users_export.tsv', sep='\t', index=False)
2	code	print(f'Exported {len(df)} user records')\nprint(f'Average logins: {df.login_count.mean():.1f}')

Frequently Asked Questions (FAQ)

Q: How is notebook content organized in the TSV output?

A: Each notebook cell becomes a row with columns for cell index, cell type (code or markdown), and the cell source content. The header row labels these columns, and you can sort or filter by any column in a spreadsheet.

Q: How are multi-line code cells handled in a single TSV row?

A: Multi-line cell content is kept in a single content column, with internal newlines escaped as literal \n sequences, as shown in the worked examples above. This keeps each cell on exactly one row, preserving the tabular structure.
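
Assuming newlines are escaped as literal \n sequences (as in the worked examples above), restoring the original multi-line source is a simple reverse mapping; the cell value below is a hypothetical escaped field from a TSV row:

```python
# Hypothetical escaped cell value taken from one TSV field
escaped = "import numpy as np\\ncounts = np.array([1250, 890, 2340])"

# Reverse the escaping: literal \n and \t back to real characters
restored = escaped.replace("\\n", "\n").replace("\\t", "\t")
print(restored)
```
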

Q: Can I open the TSV file in Excel?

A: Yes, Excel supports TSV files natively. You can open the file directly or use the Text Import wizard to specify tab as the delimiter. Google Sheets and LibreOffice Calc also import TSV files seamlessly.

Q: Why choose TSV over CSV for this conversion?

A: Code content frequently contains commas, which would require extensive quoting in CSV format. Tab characters are almost never present in source code, so TSV provides a cleaner output that is simpler to parse and less prone to formatting errors.

Q: Can I load the TSV into a database?

A: Yes, all major databases support importing tab-delimited data. Use MySQL's LOAD DATA INFILE, PostgreSQL's COPY command, or SQLite's .import mode to bulk-load the TSV content into database tables.
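
For the SQLite route, the same bulk load can also be done without the .import shell command by parsing the TSV with the standard-library csv module and inserting the rows via sqlite3; the table name and sample data here are hypothetical:

```python
import csv
import io
import sqlite3

# Sample TSV content in the converter's output shape
tsv_data = "index\tcell_type\tsource\n0\tmarkdown\t# Report\n1\tcode\tprint('hi')"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cells (idx INTEGER, cell_type TEXT, source TEXT)")

reader = csv.reader(io.StringIO(tsv_data), delimiter="\t")
next(reader)  # skip the header row
conn.executemany("INSERT INTO cells VALUES (?, ?, ?)", reader)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM cells").fetchone()[0]
print(count)  # 2
```
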

Q: Can I process the TSV with pandas?

A: Absolutely. Use pandas.read_csv('file.tsv', sep='\t') to load the TSV into a DataFrame. This lets you analyze notebook content using all of pandas' powerful data manipulation capabilities.

Q: Are execution outputs included in the TSV?

A: The converter focuses on the source content of cells. Execution outputs are not included in the tabular output, as they are session-specific artifacts. Only the authored code and markdown content is extracted.

Q: Can I combine TSV files from multiple notebooks?

A: Yes. TSV files can be concatenated on the command line (cat, or tail -n +2 on all but the first file so header rows are not repeated) or merged in a spreadsheet. Adding a filename column helps identify which notebook each cell originated from when combining multiple files.
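
The merge-with-a-filename-column idea can be sketched in Python; the function name is illustrative, and the sketch assumes every file shares the same header row:

```python
def combine_tsvs(named_tsvs):
    """Merge TSV texts, keeping one header and prepending a notebook column.

    named_tsvs: list of (filename, tsv_text) pairs. Sketch only.
    """
    out = []
    for filename, text in named_tsvs:
        lines = text.splitlines()
        if not out:
            out.append("notebook\t" + lines[0])  # emit the header once
        for row in lines[1:]:
            out.append(f"{filename}\t{row}")
    return "\n".join(out)

combined = combine_tsvs([
    ("a.ipynb", "index\tcell_type\tsource\n0\tcode\tprint(1)"),
    ("b.ipynb", "index\tcell_type\tsource\n0\tmarkdown\t# Notes"),
])
print(combined)
```

The combined output has one header row and a leading notebook column, ready for a pivot table or a database load.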