Convert TSV to HEX

Drag and drop files here or click to select.
Max file size 100mb.

Uploading progress:

TSV vs HEX Format Comparison

Aspect	TSV (Source Format)	HEX (Target Format)
Format Overview	TSV Tab-Separated Values Plain text format for storing tabular data where columns are separated by tab characters. Clipboard-native and widely used in bioinformatics, genomics, and data pipelines. Simpler than CSV because tabs rarely appear in data, eliminating quoting issues entirely. Tabular Data Clipboard-Native	HEX Hexadecimal Representation A byte-level representation of file content using hexadecimal notation (base-16). Each byte is displayed as two hex digits (00-FF). Used for debugging, data inspection, binary analysis, and low-level data examination. Essential for understanding file encoding and detecting hidden characters. Binary Data Debugging
Technical Specifications	Structure: Rows separated by newlines, columns by tabs Delimiter: Tab character (U+0009) Encoding: UTF-8, ASCII, or UTF-16 Headers: Optional first row as column names Extensions: .tsv, .tab	Structure: Offset + hex bytes + ASCII representation Base: Base-16 (digits 0-9, letters A-F) Byte Size: 2 hex characters per byte Layout: Typically 16 bytes per line Extensions: .hex, .txt
Syntax Examples	TSV uses tab-separated columns: Name Age City Alice 30 New York Bob 25 London	HEX shows bytes in hexadecimal: 00000000 4e 61 6d 65 09 41 67 65 \|Name.Age\| 00000008 09 43 69 74 79 0a 41 6c \|.City.Al\| 00000010 69 63 65 09 33 30 09 4e \|ice.30.N\| 00000018 65 77 20 59 6f 72 6b 0a \|ew York.\|
Content Support	Tabular data with rows and columns Text, numbers, and dates No quoting needed (tabs rarely in data) Direct clipboard paste support Large datasets (millions of rows) Bioinformatics standard (BED, GFF, VCF)	Exact byte-level representation Reveals hidden/control characters Shows encoding (UTF-8 BOM, line endings) Offset addressing for byte location ASCII sidebar for human readability Works with any file type or encoding
Advantages	No quoting issues (tabs rarely appear in data) Clipboard-native: paste directly from spreadsheets Standard in bioinformatics and genomics Simpler parsing than CSV Human-readable in any text editor Minimal overhead and tiny file size	Reveals exact byte content of any file Essential for debugging encoding issues Shows tab characters (09) and line endings Detects BOM markers and hidden characters Universal -- works with any file format Critical tool for data forensics
Disadvantages	No formatting or styling No data types (everything is text) Tab characters in data can break parsing No multi-sheet support No metadata or schema definition	Not human-readable for data content File size increases significantly (~3x) No semantic structure or meaning Requires hex knowledge to interpret Not suitable for data exchange
Common Uses	Bioinformatics data exchange (BED, GFF) Clipboard copy/paste between apps Database exports and imports Scientific data tables Log file analysis and processing	Debugging file encoding issues Inspecting binary file headers Verifying data integrity Forensic data analysis Detecting hidden characters Reverse engineering file formats
Best For	Clipboard-based data transfer Bioinformatics and genomics workflows Simple tabular data without quoting hassles Unix/Linux data processing pipelines	Debugging TSV encoding problems Verifying tab delimiters are correct Inspecting non-printable characters Low-level data analysis
Version History	Origin: Early Unix tools (cut, paste, awk) IANA Registration: text/tab-separated-values Status: Widely used, stable MIME Type: text/tab-separated-values	Origin: Early computing (hex dumps since 1960s) Tools: xxd, hexdump, od (Unix) Status: Fundamental computing tool Related: Intel HEX, Motorola S-Record
Software Support	Spreadsheets: Excel, Google Sheets, LibreOffice Text Editors: Any text editor (Notepad, VS Code) Programming: Python (csv module), R, pandas Other: Unix tools (cut, awk, sort), databases	CLI Tools: xxd, hexdump, od (Unix/Linux/macOS) Editors: HxD, Hex Fiend, 010 Editor IDEs: VS Code (Hex Editor extension) Programming: Python (binascii), any language

Why Convert TSV to HEX?

Converting TSV to hexadecimal representation reveals the exact byte-level content of your tab-delimited data. This is invaluable for debugging encoding issues, verifying that tab characters are correctly placed, detecting hidden BOM (Byte Order Mark) characters, and inspecting line ending formats (CR, LF, or CRLF). When a TSV file behaves unexpectedly, a hex dump is often the fastest way to identify the root cause.

TSV files are particularly interesting to inspect in hex because you can clearly see the tab delimiter (byte 09) separating each column. This makes it easy to verify that your TSV file is correctly formatted -- you can confirm that delimiters are actual tab characters and not spaces, that line endings are consistent, and that the encoding is what you expect (UTF-8, ASCII, or UTF-16).

In bioinformatics, where TSV is the standard data format, hex inspection is crucial for troubleshooting data pipeline issues. When a genomic data file fails to parse, a hex dump can reveal invisible characters, wrong line endings from Windows/Unix conversion, or encoding mismatches. Our converter produces a clean hex dump with offset addresses and ASCII sidebar for easy analysis.

The hex output follows the standard hex dump format used by tools like xxd and hexdump: each line shows the byte offset, 16 bytes in hexadecimal, and the ASCII representation of those bytes. Non-printable characters (including the tab delimiter 0x09) are shown as dots in the ASCII column, making them easy to spot.

Key Benefits of Converting TSV to HEX:

Debug Encoding: Identify UTF-8 BOM, encoding issues, and character set problems
Verify Delimiters: Confirm tabs (0x09) are used, not spaces or other characters
Inspect Line Endings: Detect CR (0x0D), LF (0x0A), or CRLF differences
Find Hidden Characters: Reveal zero-width spaces, non-breaking spaces, and control chars
Byte-Level View: See the exact binary content of every cell
Standard Format: Output compatible with xxd, hexdump, and hex editors
Data Forensics: Essential for investigating parsing failures and data corruption

Practical Examples

Example 1: Debugging Tab Delimiters

Input TSV file (data.tsv):

ID	Name	Value
1	Alpha	100
2	Beta	200

Output HEX file (data.hex):

00000000  49 44 09 4e 61 6d 65 09  56 61 6c 75 65 0a 31 09  |ID.Name.Value.1.|
00000010  41 6c 70 68 61 09 31 30  30 0a 32 09 42 65 74 61  |Alpha.100.2.Beta|
00000020  09 32 30 30 0a                                    |.200.|

Tab characters visible as 09, line feeds as 0a.

Example 2: Detecting BOM and Encoding

Input TSV file (utf8_bom.tsv):

(UTF-8 BOM file with content:)
Name	City
Anna	Zurich

Output HEX file (utf8_bom.hex):

00000000  ef bb bf 4e 61 6d 65 09  43 69 74 79 0a 41 6e 6e  |...Name.City.Ann|
00000010  61 09 5a c3 bc 72 69 63  68 0a                    |a.Z..rich.|

BOM detected: ef bb bf (UTF-8 BOM)
UTF-8 umlaut u: c3 bc (two bytes for the character 'u' with umlaut)

Example 3: Windows vs Unix Line Endings

Input TSV file (windows.tsv):

(Windows line endings - CRLF:)
A	B
1	2

Output HEX file (windows.hex):

00000000  41 09 42 0d 0a 31 09 32  0d 0a                    |A.B..1.2..|

Windows CRLF line endings visible as 0d 0a (CR+LF).
Unix would show only 0a (LF).
Tab delimiters clearly visible as 09.

Frequently Asked Questions (FAQ)

Q: What is a hex dump?

A: A hex dump is a representation of a file's binary content using hexadecimal notation. Each byte is shown as two hex digits (00-FF). A typical hex dump shows the byte offset on the left, hex bytes in the middle, and ASCII interpretation on the right. This format is used by tools like xxd, hexdump, and hex editors to inspect files at the byte level.

Q: Why would I need to convert TSV to HEX?

A: The most common reason is debugging. When a TSV file won't parse correctly, a hex dump reveals the exact bytes in the file. You can verify that tab characters (0x09) are actual tabs and not spaces, check line endings (LF vs CRLF), detect BOM markers, and find hidden characters that are invisible in text editors. This is especially important in bioinformatics pipelines where data integrity is critical.

Q: How do I identify tab characters in the hex output?

A: Tab characters appear as the hex byte 09 in the output. In the ASCII column on the right side of the hex dump, tabs are typically shown as dots (.) since they are non-printable characters. If you see spaces (hex 20) where you expect tabs, your file may not be a valid TSV file.

Q: What does BOM look like in hex?

A: The UTF-8 BOM (Byte Order Mark) appears as the three bytes EF BB BF at the very beginning of the file. UTF-16 Little Endian BOM is FF FE, and UTF-16 Big Endian is FE FF. BOMs can cause parsing issues in TSV files if the parser doesn't handle them, and a hex dump is the quickest way to detect them.

Q: How can I tell if my TSV has Windows or Unix line endings?

A: In the hex output, Unix line endings appear as 0A (LF - Line Feed), while Windows line endings appear as 0D 0A (CR+LF - Carriage Return + Line Feed). Old Mac line endings use 0D (CR only). Mixed line endings in a single file are a common source of parsing problems, and hex dumps make them easy to identify.

Q: Is the hex output reversible back to TSV?

A: Yes! A hex dump can be converted back to the original binary file using tools like xxd -r (reverse). The hex representation is a lossless encoding of the original data. However, the primary purpose of TSV to HEX conversion is inspection and debugging, not data transformation.

Q: Why does the hex output show dots in the ASCII column?

A: The ASCII column shows a dot (.) for any byte that is not a printable ASCII character (outside the range 0x20-0x7E). This includes tab characters (0x09), line feeds (0x0A), carriage returns (0x0D), null bytes (0x00), and any byte above 0x7E (such as UTF-8 multi-byte sequences). This convention makes it easy to spot non-printable characters.

Q: How does hex dump help with bioinformatics TSV files?

A: Bioinformatics TSV files (BED, GFF, VCF) often pass through multiple processing steps on different operating systems. Hex dumps help identify encoding changes, line ending conversions, and invisible characters introduced during data transfer. When a genomic data pipeline fails, a hex dump of the input TSV is often the first diagnostic step to identify the issue.