Command Line Tool and Library To Eliminate Duplicates and Facilitate Intelligent Merging of Data Structures.
LITEN IS IN VERY ACTIVE DEVELOPMENT
Installation
You can use setuptools, via easy_install, to install the module and to place the script in your bin directory automatically. The setuptools bootstrap script, ez_setup.py, can be downloaded here:
http://
Then easy_install the egg that matches your version of Python, either Python 2.4 or Python 2.5.
Easy Install Example:
easy_install-2.4 http://
or
easy_install liten (This downloads the 2.5 egg from the Cheese Shop)
You can also visit the Cheese Shop directly: http://
RPM Version Example:
Download RPM (Requires Python 2.4 or Python 2.5): http://
Install:
rpm -ivh liten-0.
Debian/Ubuntu Package Version
Download the .deb (Requires Python 2.5): http://
Note: This should work on almost any Ubuntu or Debian system with Python 2.5
Install:
dpkg -i liten-0.
Download the Script
If you are in a huge rush, or don't have a package for your distribution, then you can just download liten.py and run it:
Do:
wget http://
or perhaps:
curl http://
Version: 0.1.3
Description
A deduplication command line tool and library. Duplicate files/file objects are found with a relatively efficient two-step algorithm: files are first filtered by byte size, so that only files of the same size are compared, and a full md5 checksum is then performed on those candidates.
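The two steps described above can be sketched as follows. This is a minimal illustration written for this document, not Liten's actual code: the names md5_of_file and find_duplicates, the min_size parameter, and the choice of the alphabetically first path as the "original" are all assumptions.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, min_size=0):
    """Map each 'original' path under root to a list of its duplicates.

    Hypothetical helper: names and return shape are illustrative only.
    """
    # Step 1: group paths by byte size. A file with a unique size
    # cannot have a duplicate, so it never needs to be checksummed.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size >= min_size:
                by_size[size].append(path)

    # Step 2: run a full MD5 checksum only on files that share a size.
    duplicates = {}
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        seen = {}  # checksum -> first path seen with that checksum
        for path in paths:
            checksum = md5_of_file(path)
            if checksum in seen:
                duplicates.setdefault(seen[checksum], []).append(path)
            else:
                seen[checksum] = path
    return duplicates
```

Filtering by size first keeps the expensive step, reading and hashing file contents, limited to files that could plausibly match.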
Example CLI Usage:
liten.py -s 1 /mnt/raid is equivalent to liten.py -s 1MB /mnt/raid (a size given without a unit is read as MB)
liten.py -s 1bytes /mnt/raid
liten.py -s 1KB /mnt/raid
liten.py -s 1MB /mnt/raid
liten.py -s 1GB /mnt/raid
liten.py -s 1TB /mnt/raid
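One way to interpret the size arguments shown above is sketched here. It is an illustration only: parse_size and UNIT_MULTIPLIERS are hypothetical names, and the use of binary multiples (1 KB = 1024 bytes) is an assumption, since the examples do not say whether Liten counts in units of 1000 or 1024.

```python
import re

# Assumption: binary multiples. The source does not state which base is used.
UNIT_MULTIPLIERS = {
    "BYTES": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}

# A whole number followed by an optional unit, e.g. "1", "1KB", "1bytes".
SIZE_PATTERN = re.compile(r"^\s*(\d+)\s*([A-Za-z]*)\s*$")


def parse_size(text, default_unit="MB"):
    """Convert a size string to a byte count (hypothetical helper)."""
    match = SIZE_PATTERN.match(text)
    if match is None:
        raise ValueError("unrecognized size: %r" % (text,))
    number, unit = match.groups()
    # A bare number falls back to the default unit, matching "-s 1" == "-s 1MB".
    unit = unit.upper() or default_unit
    if unit not in UNIT_MULTIPLIERS:
        raise ValueError("unknown unit: %r" % (unit,))
    return int(number) * UNIT_MULTIPLIERS[unit]
```

With this sketch, parse_size("1") and parse_size("1MB") return the same byte count, mirroring the first example line.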
Example Library Usage:
Currently Liten is optimized for CLI use, but more library-friendly changes are coming.
>>> Liten = LitenBaseClass(
>>> dupeFileOne = 'testData/
>>> checksumOne = Liten.createChe
>>> dupeFileTwo = 'testData/
>>> checksumTwo = Liten.createChe
>>> nonDupeFile = 'testData/
>>> checksumThree = Liten.createChe
>>> checksumOne == checksumTwo
True
>>> checksumOne == checksumThree
False
Tests:
Run Doctests: ./liten -t or --test
Run test_liten.py
Display Options:
STDOUT: Duplicate file paths and sizes are printed to stdout, such as:
Printing dups over 1 MB using md5 checksum: [SIZE] [ORIG] [DUP]
7 MB Orig: /Users/
Dupe: /Users/
REPORT:
A report named LitenDuplicateR
Duplicate Version, Path, Size, ModDate
Original, /Users/
Duplicate, /Users/
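A report with these four columns could be produced along the following lines. This is only a sketch: it assumes a plain comma-separated layout, the function name write_report and its argument shape are hypothetical, and the paths in the usage example are placeholders rather than values from a real run.

```python
import csv
import io

REPORT_HEADER = ["Duplicate Version", "Path", "Size", "ModDate"]


def write_report(pairs, stream):
    """Write one Original row and one Duplicate row per pair.

    Each pair is (original, duplicate); each side is a
    (path, size_label, mod_date) tuple. Hypothetical helper.
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(REPORT_HEADER)
    for original, duplicate in pairs:
        writer.writerow(["Original"] + list(original))
        writer.writerow(["Duplicate"] + list(duplicate))


# Usage with placeholder values, written to an in-memory buffer.
buffer = io.StringIO()
write_report(
    [(("/tmp/example/a.bin", "7 MB", "2008-01-01"),
      ("/tmp/example/b.bin", "7 MB", "2008-01-02"))],
    buffer,
)
```

The csv module writes fields without a space after each comma, so the output differs slightly in spacing from the sample rows above.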
DEBUG MODE ENVIRONMENT VARIABLES:
To enable print statement debugging set LITEN_DEBUG to 1
To enable pdb break point debugging set LITEN_DEBUG to 2
LITEN_DEBUG_MODE = int(os.
Note: When DEBUG MODE is enabled, a message is printed to standard output.
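The debug levels above could be read from the environment roughly as follows. This is a hedged sketch rather than Liten's own code: the helper names read_debug_mode and debug, the fallback to 0 for missing or non-numeric values, and the wording of the printed message are assumptions.

```python
import os
import pdb


def read_debug_mode(environ=None):
    """Return 0 (off), 1 (print debugging) or 2 (pdb break points).

    Hypothetical helper; a missing or non-numeric LITEN_DEBUG is
    treated as 0, which is an assumption.
    """
    if environ is None:
        environ = os.environ
    try:
        return int(environ.get("LITEN_DEBUG", 0))
    except ValueError:
        return 0


def debug(message, mode):
    """Print the message at level 1 or higher; drop into pdb at level 2."""
    if mode >= 1:
        print("DEBUG: %s" % message)
    if mode >= 2:
        pdb.set_trace()
```

From a shell, the variable would be set for a single run with something like LITEN_DEBUG=1 placed before the command.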
trunk series is the current focus of development.