Convert ADOC to TSV
Max file size: 100 MB.
ADOC vs TSV Format Comparison
| Aspect | ADOC (Source Format) | TSV (Target Format) |
|---|---|---|
| Format Overview | ADOC (AsciiDoc Markup Language)<br>Lightweight markup language designed for writing technical documentation, articles, books, and other structured content. Created by Stuart Rackham in 2002, AsciiDoc uses plain text syntax that can be converted to HTML, PDF, EPUB, and other formats. Known for its readable source format and powerful features for documentation.<br>Tags: Documentation Format, Plain Text | TSV (Tab-Separated Values)<br>Plain text format for storing tabular data where values are separated by tab characters. Similar to CSV but uses tabs instead of commas as delimiters. TSV is particularly useful when data contains commas, as it avoids the need for quoting. Widely supported by spreadsheet applications, databases, and data processing tools.<br>Tags: Data Format, Tab-Delimited |
| Technical Specifications | Structure: Plain text with markup syntax<br>Encoding: UTF-8 (recommended)<br>Format: Human-readable markup<br>Compression: None (plain text)<br>Extensions: .adoc, .asciidoc, .asc | Structure: Rows and columns (tabular)<br>Encoding: ASCII, UTF-8, or locale-specific<br>Format: Plain text with tab delimiters<br>Compression: None (plain text)<br>Extensions: .tsv, .tab, .txt |
| Syntax Examples | AsciiDoc table syntax:<br>`.Employee Directory`<br>`\|===`<br>`\|Name \|Department \|Email`<br>`\|John Smith \|Engineering \|[email protected]`<br>`\|Jane Doe \|Marketing \|[email protected]`<br>`\|===` | TSV equivalent (tabs between columns):<br>`Name	Department	Email`<br>`John Smith	Engineering	[email protected]`<br>`Jane Doe	Marketing	[email protected]` |
| Content Support | | |
| Advantages | | |
| Disadvantages | | |
| Common Uses | | |
| Best For | | |
| Version History | Introduced: 2002 (Stuart Rackham)<br>Current Version: Asciidoctor 2.x<br>Status: Actively developed<br>Evolution: Asciidoctor is the modern implementation | Introduced: 1960s (computing era)<br>Standardized: IANA registered (text/tab-separated-values)<br>Status: Stable, widely used<br>Evolution: Minimal changes, simple spec |
| Software Support | Asciidoctor: Primary processor (Ruby/Java/JS)<br>IDEs: VS Code, IntelliJ, Atom plugins<br>Editors: AsciidocFX, AsciiDoc Live<br>Other: GitHub, GitLab rendering | Microsoft Excel: Full support (import/export)<br>LibreOffice Calc: Full support<br>Google Sheets: Import support<br>Other: Unix tools (cut, awk), Python, R |
Why Convert ADOC to TSV?
Converting AsciiDoc documents to TSV (Tab-Separated Values) is a good fit when you need to extract tabular data whose values may contain commas. Unlike CSV, which uses commas as delimiters, TSV separates fields with tab characters, so addresses, descriptions, and other text with embedded commas need no quoting or escaping. The resulting data is simpler to process.
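To make the quoting difference concrete, here is a minimal sketch using Python's standard csv module. The `serialize` helper is a name chosen for this illustration, and the row is taken from the contact example further down the page.

```python
import csv
import io

# One record whose address contains commas.
row = ["Headquarters", "123 Main Street, Suite 400, New York, NY 10001", "+1 (555) 123-4567"]


def serialize(fields, delimiter):
    """Write a single row with the given delimiter and return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(fields)
    return buffer.getvalue()


csv_line = serialize(row, ",")   # the address field must be wrapped in quotes
tsv_line = serialize(row, "\t")  # no field contains a tab, so nothing is quoted

print(csv_line, end="")
print(tsv_line, end="")
```

With the comma delimiter the writer has to quote the address; with the tab delimiter the same row is written as-is.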
AsciiDoc tables often contain technical specifications, pricing information with currency symbols, addresses, or descriptive text that naturally includes commas. By converting to TSV, you preserve all this data without worrying about delimiter conflicts. Many scientific and bioinformatics tools prefer TSV format because it handles complex text data more reliably than CSV.
TSV files work well with command-line tools in Unix/Linux environments. cut and paste use the tab character as their default delimiter, while awk and sort handle tab-delimited data once the separator is set (awk -F'\t', sort -t$'\t'). This makes TSV an excellent choice when you need to process data with shell scripts or integrate with data pipelines. The conversion from AsciiDoc documentation to TSV bridges the gap between human-readable documents and machine-processable data.
The tab character serves as a natural delimiter because it rarely appears in normal text content. This makes TSV parsing simpler and more reliable than CSV, which requires handling quoted fields, escaped commas, and various edge cases. For data interchange between systems, TSV often provides fewer surprises and cleaner integration.
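As a small illustration of the parsing point, the sketch below splits the same record once on tabs and once on commas. The record comes from the contact example on this page; the variable names are only for this sketch.

```python
# The same record, once tab-delimited and once comma-delimited without quoting.
tsv_line = "West Coast\t456 Pacific Avenue, San Francisco, CA 94102\t+1 (555) 987-6543"
naive_csv_line = "West Coast,456 Pacific Avenue, San Francisco, CA 94102,+1 (555) 987-6543"

tab_fields = tsv_line.split("\t")          # three fields, address intact
comma_fields = naive_csv_line.split(",")   # five pieces, address broken apart

print(len(tab_fields), tab_fields[1])
print(len(comma_fields), comma_fields[1:4])
```

A plain split is enough for TSV as long as no field contains a tab or line break; CSV needs a real parser to undo quoting.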
Key Benefits of Converting ADOC to TSV:
- Comma-Safe: Data containing commas doesn't require escaping or quoting
- Clean Parsing: Simpler parsing rules than CSV with fewer edge cases
- Unix-Friendly: Works well with command-line tools (cut, paste, awk, sort)
- Scientific Standard: Preferred format in bioinformatics and research data
- Copy-Paste Ready: Tab-separated text pastes into spreadsheets with each value in its own cell
- Readable Format: Columns roughly align in text editors when values are shorter than the tab width
- Wide Support: Compatible with Excel, Python, R, and databases
Practical Examples
Example 1: Contact Directory with Addresses
Input AsciiDoc file (contacts.adoc):

    = Company Contacts

    .Office Locations
    |===
    |Office |Address |Phone

    |Headquarters |123 Main Street, Suite 400, New York, NY 10001 |+1 (555) 123-4567
    |West Coast |456 Pacific Avenue, San Francisco, CA 94102 |+1 (555) 987-6543
    |Europe |78 High Street, London, UK EC2A 4AA |+44 20 7123 4567
    |===

Output TSV file (contacts.tsv):

    Office	Address	Phone
    Headquarters	123 Main Street, Suite 400, New York, NY 10001	+1 (555) 123-4567
    West Coast	456 Pacific Avenue, San Francisco, CA 94102	+1 (555) 987-6543
    Europe	78 High Street, London, UK EC2A 4AA	+44 20 7123 4567

Benefits:
- Addresses with commas preserved without quoting
- Easy import into Excel, Google Sheets
- Ready for database import
- Command-line processing with cut/awk
Example 2: Product Catalog with Descriptions
Input AsciiDoc file (products.adoc):

    .Product Specifications
    |===
    |SKU |Name |Description |Price

    |WDG-100 |Pro Widget |Heavy-duty, weather-resistant, multi-purpose widget |$149.99
    |GDT-200 |Smart Gadget |Bluetooth-enabled, rechargeable, compact design |$89.99
    |ACC-300 |Deluxe Accessory Kit |Includes cables, adapters, carrying case, and manual |$49.99
    |===

Output TSV file (products.tsv):

    SKU	Name	Description	Price
    WDG-100	Pro Widget	Heavy-duty, weather-resistant, multi-purpose widget	$149.99
    GDT-200	Smart Gadget	Bluetooth-enabled, rechargeable, compact design	$89.99
    ACC-300	Deluxe Accessory Kit	Includes cables, adapters, carrying case, and manual	$49.99

Benefits:
- Descriptions with commas handled cleanly
- Price data ready for analysis
- Import into inventory systems
- Create reports from documentation data
Example 3: Research Data Table
Input AsciiDoc file (research.adoc):

    == Experiment Results

    .Sample Analysis Data
    |===
    |Sample ID |Compound |Concentration (mg/L) |Notes

    |S001 |Glucose, Fructose |12.5 |Control sample, baseline
    |S002 |Glucose, Fructose |15.8 |Treatment A, day 1
    |S003 |Sucrose, Maltose |8.3 |Treatment B, day 1
    |===

Output TSV file (research.tsv):

    Sample ID	Compound	Concentration (mg/L)	Notes
    S001	Glucose, Fructose	12.5	Control sample, baseline
    S002	Glucose, Fructose	15.8	Treatment A, day 1
    S003	Sucrose, Maltose	8.3	Treatment B, day 1

Benefits:
- Scientific data preserved accurately
- Compound names with commas intact
- Ready for R, Python pandas analysis
- Standard format for bioinformatics tools
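As a rough illustration of the analysis step, the research table above can be read with Python's standard csv module. The sketch below embeds the TSV text in a string instead of reading research.tsv from disk, and it uses only the standard library rather than pandas; the variable names are chosen for this example.

```python
import csv
import io
from collections import defaultdict

# Contents of research.tsv from the example, embedded for a self-contained run.
research_tsv = (
    "Sample ID\tCompound\tConcentration (mg/L)\tNotes\n"
    "S001\tGlucose, Fructose\t12.5\tControl sample, baseline\n"
    "S002\tGlucose, Fructose\t15.8\tTreatment A, day 1\n"
    "S003\tSucrose, Maltose\t8.3\tTreatment B, day 1\n"
)

# Group concentrations by compound; the commas in compound names stay intact.
concentrations = defaultdict(list)
reader = csv.DictReader(io.StringIO(research_tsv), delimiter="\t", quoting=csv.QUOTE_NONE)
for record in reader:
    concentrations[record["Compound"]].append(float(record["Concentration (mg/L)"]))

# Mean concentration per compound.
means = {name: sum(values) / len(values) for name, values in concentrations.items()}

for name, mean in means.items():
    print(f"{name}: {mean:.2f} mg/L")
```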
Frequently Asked Questions (FAQ)
Q: What is the difference between TSV and CSV?
A: TSV (Tab-Separated Values) uses tab characters as delimiters between columns, while CSV (Comma-Separated Values) uses commas. TSV is better when your data contains commas (like addresses or descriptions) because you don't need to quote fields. CSV is more common but requires escaping or quoting when data contains commas.
Q: Can I open TSV files in Excel?
A: Yes! Microsoft Excel fully supports TSV files. You can open them directly (Excel may auto-detect the tab delimiter) or use File > Import with the Text Import Wizard. Google Sheets and LibreOffice Calc also support TSV import. The data will be properly separated into columns based on the tab delimiters.
Q: Why choose TSV over CSV for this conversion?
A: Choose TSV when your AsciiDoc tables contain text with commas, such as addresses, descriptions, or lists within cells. TSV handles these naturally without needing quotes or escape characters. It's also preferred for scientific data, command-line processing with Unix tools, and when you need cleaner, more readable raw data files.
Q: How does the converter handle AsciiDoc table formatting?
A: The converter extracts the text content from AsciiDoc tables, removing markup syntax like column specifications and formatting attributes. Each table row becomes a TSV row, and cell contents are separated by tabs. Headers from the first row are preserved. Table titles and captions are not included in the TSV output.
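The extraction described in this answer can be sketched in a few lines of Python. This is not the converter's actual implementation. It is a simplified illustration that assumes one table row per line, `|===` delimiters, and no cell specifiers, escaped pipes, or multi-line cells; `adoc_table_to_tsv` is a hypothetical name.

```python
def adoc_table_to_tsv(adoc_text):
    """Convert the first |=== table in adoc_text to TSV (simple tables only)."""
    rows = []
    in_table = False
    for raw_line in adoc_text.splitlines():
        line = raw_line.strip()
        if line == "|===":
            if in_table:
                break  # closing delimiter: stop after the first table
            in_table = True
            continue
        if not in_table or not line.startswith("|"):
            continue  # skips titles, blank lines, and text outside the table
        # Drop the leading "|" and split the rest of the line into cells.
        cells = [cell.strip() for cell in line[1:].split("|")]
        rows.append("\t".join(cells))
    return "\n".join(rows) + "\n" if rows else ""


sample = """= Company Contacts

.Office Locations
|===
|Office |Address |Phone

|Headquarters |123 Main Street, Suite 400, New York, NY 10001 |+1 (555) 123-4567
|===
"""

tsv_output = adoc_table_to_tsv(sample)
print(tsv_output, end="")
```

The table title (`.Office Locations`) and the document title are skipped because they sit outside the `|===` delimiters, which matches the behavior described above.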
Q: What if my AsciiDoc data contains tab characters?
A: Tab characters within cell data are rare but would conflict with the TSV delimiter. The converter will either replace tabs with spaces or escape them appropriately. In practice, this is uncommon since AsciiDoc content typically doesn't include literal tab characters in table cells.
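The two strategies mentioned here, replacing or escaping, might look like the following sketch. Which one the converter applies is not documented on this page, so both function names and the backslash escape convention are assumptions made for illustration.

```python
def replace_tabs(cell, replacement=" "):
    """Replace tabs and line breaks so a cell cannot split a TSV row."""
    # Handle "\r\n" first so a Windows line ending becomes one replacement.
    for char in ("\r\n", "\r", "\n", "\t"):
        cell = cell.replace(char, replacement)
    return cell


def escape_tabs(cell):
    """Escape tabs and line breaks with backslash sequences (assumed convention)."""
    # Escape existing backslashes first so the result stays unambiguous.
    cell = cell.replace("\\", "\\\\")
    return cell.replace("\t", "\\t").replace("\r", "\\r").replace("\n", "\\n")


raw_cell = "Line one\tcolumn break\r\nline two"
print(replace_tabs(raw_cell))
print(escape_tabs(raw_cell))
```

Replacement is lossy but keeps the file readable in any tool; escaping preserves the original content but requires the consumer to know the convention.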
Q: Can I process the TSV output with command-line tools?
A: Absolutely! Tools such as cut and paste use the tab as their default delimiter, and others accept it as an option. You can use "cut -f2" to extract the second column, "awk -F'\t'" for field processing, or "sort -t$'\t' -k2" for sorting. Python's csv module (with delimiter='\t') and R's read.delim() function also read TSV files.
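For readers working in Python rather than the shell, a rough equivalent of the column extraction and sorting looks like this. The data is a trimmed copy of the product example (Description column omitted for brevity), and the variable names are illustrative only.

```python
import csv
import io

# Trimmed product table; a real script would open products.tsv instead.
tsv_text = (
    "SKU\tName\tPrice\n"
    "WDG-100\tPro Widget\t$149.99\n"
    "GDT-200\tSmart Gadget\t$89.99\n"
    "ACC-300\tDeluxe Accessory Kit\t$49.99\n"
)

# QUOTE_NONE treats quote characters as ordinary data, as plain TSV does.
reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
header, *records = list(reader)

second_column = [record[1] for record in records]          # similar to cut -f2
by_name = sorted(records, key=lambda record: record[1])    # similar to sorting on field 2

print(second_column)
print([record[0] for record in by_name])
```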
Q: Is TSV suitable for database imports?
A: Yes! Most databases support TSV imports. MySQL's LOAD DATA INFILE and PostgreSQL's COPY command use the tab as their default field delimiter, and SQLite's .import accepts tab-delimited files once ".mode tabs" is set. TSV is often preferred for imports because it avoids the quoting complexity of CSV, especially for text-heavy data.
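A database import can also be scripted. The sketch below loads the contact example into an in-memory SQLite database with Python's sqlite3 module; the table name `offices` and its column names are assumptions made for this example.

```python
import csv
import io
import sqlite3

# Contents of contacts.tsv from the example, embedded for a self-contained run.
contacts_tsv = (
    "Office\tAddress\tPhone\n"
    "Headquarters\t123 Main Street, Suite 400, New York, NY 10001\t+1 (555) 123-4567\n"
    "West Coast\t456 Pacific Avenue, San Francisco, CA 94102\t+1 (555) 987-6543\n"
    "Europe\t78 High Street, London, UK EC2A 4AA\t+44 20 7123 4567\n"
)

rows = list(csv.reader(io.StringIO(contacts_tsv), delimiter="\t", quoting=csv.QUOTE_NONE))
records = rows[1:]  # skip the header row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offices (office TEXT, address TEXT, phone TEXT)")
conn.executemany("INSERT INTO offices VALUES (?, ?, ?)", records)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM offices").fetchone()[0]
europe_address = conn.execute(
    "SELECT address FROM offices WHERE office = ?", ("Europe",)
).fetchone()[0]
print(row_count, europe_address)
```

Parameterized inserts keep the commas and other punctuation in each cell untouched.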
Q: What encoding is used for the TSV output?
A: The converter outputs UTF-8 encoded TSV files, ensuring support for international characters, special symbols, and any text from your original AsciiDoc content. Most modern applications and programming languages handle UTF-8 by default, making the output widely compatible.