Supports text and binary files (.txt, .png, .pdf) with custom bit-level manipulation.
- Project Overview
- Key Features
- What is Huffman Encoding?
- Project Architecture
- Technologies Used
- How to Run
- Author
This project is a lossless file compression and decompression system built from scratch in Java. It utilizes the Huffman Encoding algorithm to reduce file sizes by assigning shorter binary codes to frequently occurring bytes and longer codes to less frequent ones.
Unlike simple text-only compressors, this system handles binary files (images, PDFs, executables) by implementing custom bit-level input/output streams, ensuring optimal compression without any data loss.
✅ Lossless Compression: Reduces file size without losing any data quality.
✅ Binary File Support: Works on .txt, .png, .jpg, .pdf, etc.
✅ Custom Bit Manipulation: Implements BitInputStream and BitOutputStream for precise read/write operations.
✅ Exact Restoration: Automatic decompression guarantees the output file is identical to the source.
✅ Command-Line Interface (CLI): Simple and efficient terminal-based usage.
✅ Interview Ready: Demonstrates deep understanding of Greedy Algorithms, Heaps, and Trees.
Huffman Encoding is a popular greedy algorithm used for lossless data compression.
- Frequency Analysis: It counts how often each byte appears in the file.
- Tree Construction: It builds a binary tree where frequent characters are near the top (short paths) and rare characters are at the bottom (long paths).
- Code Generation: It generates unique binary prefix codes based on the tree paths.
Real-world usage: This algorithm is the core concept behind ZIP, GZIP, PNG, and JPEG compression standards.
HuffmanCompressionJava/
│
├── HuffmanNode.java # Tree node representation (stores character & frequency)
├── HuffmanTree.java # Builds the Huffman Tree (Priority Queue implementation)
├── HuffmanEncoder.java # Generates prefix codes from the tree
├── HuffmanDecoder.java # Traverses tree to decode compressed bits
├── BitOutputStream.java # Writes individual bits (buffer management)
├── BitInputStream.java # Reads individual bits from file
├── FileCompressor.java # High-level logic for compressing files
├── FileDecompressor.java # High-level logic for decompressing files
├── Main.java # CLI Entry point
│
├── input.txt # Sample input file
├── output.huff # Compressed binary file
└── README.md
- Language: Java (JDK 8+)
- Core Concepts:
- Data Structures: Priority Queue (Min-Heap), Binary Tree, Hash Maps.
- Algorithms: Greedy Algorithms.
- System: File I/O (
FileInputStream/FileOutputStream). - Low-Level: Bit Manipulation (shifting and masking).
- Design: Object-Oriented Programming (OOP).
- Java JDK 8 or higher installed.
- Command Prompt / Terminal.
Open your terminal in the project directory and run:
javac *.java
To compress a file (e.g., input.txt or image.png):
java Main compress input.txt output.huffResult: Creates a smaller output.huff file.
To restore the original file from the compressed version:
java Main decompress output.huff restored_input.txtResult: restored_input.txt will be identical to the original input.txt.
Akash Odedara
MSc IT | Java Developer | Algorithms Enthusiast