Organising 162,356 files using FDUPES and NameMangler

I’m a data hoarder; there, I said it.

I’ve been taking digital photos since around 2002. Over the years, and across many computers and hard drives, I’ve moved, copied and added countless image and video files to a ‘temp folder’. In the back of my mind I’ve always said ‘I’ll sort these later’, only to find that weeks and months pass, and the next batch of images gets added to yet another folder. Periodically I attempted to organise my photos and videos using Adobe Lightroom (and for a time it was really useful); the file renaming feature helped me organise into year, month and day folders using the EXIF data embedded in the images. Unfortunately I forgot to back up my Lightroom catalog from a Mac during a reformat (dumbass), and after spending so much time organising in Lightroom I felt somewhat defeated.

More months and eventually years passed; my once-organised directory of images hopped from external drives to cloud storage and back again, and along the way I duplicated the main folder, sometimes adding new images to one copy and sometimes not. Disaster.

I finally decided that enough was enough, and began by collecting all my duplicate directories into a single location. After many hours of copying and migrating images from various sources I had 162,356 image and video files, totalling 408 gigabytes. The only way to tackle this first hurdle was to compare each file, keep a single copy and purge the rest. A quick search pointed me in the direction of FDUPES (https://github.com/adrianlopezroche/fdupes), a program which does exactly what I needed. Being rather precious about some of my photos, I wasn’t about to unleash FDUPES on my main directory, so I created a ‘test’: a sample of duplicate images, nested in various folders and with differing filenames. After several initial trials, I felt comfortable running it against the core directory.

The command I used was:

fdupes -r -d -N /path/to/dir
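
For reference, -r recurses into subdirectories, -d deletes duplicates and -N keeps the first file in each duplicate set without prompting. If you want a safer first pass than the one I ran, fdupes can simply list the duplicate sets for review before you let it delete anything (the paths below are placeholders):

# Dry run: list duplicate sets to a file and review them first
fdupes -r /path/to/dir > duplicates.txt

# Once happy, do the destructive pass: keep the first file in each set, delete the rest
fdupes -r -d -N /path/to/dir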

FDUPES spent a long time building a file list and comparing each file, and after recursively checking everything it began the purge. This was a rather nervous moment: watching a terminal window delete thousands of files the program had determined to be ‘dupes’. Once it completed, I was left with a fragmented and somewhat sparse core directory, in which some folders were completely empty whilst others contained hundreds of photos. At this point I thought I’d made a massive mistake and had wiped out precious images that I could never recover. After spending some time moving and collating the remaining images from these fragmented directories, my fears were allayed; the dedupe had done its task, and I now had one copy of each image. Phew!

I opted to use EXIF data to rename the files, and another search found me EXIFRenamer (http://www.qdev.de/?location=mac/exifrenamer), which allowed me to drag and drop images to rename them based on a pattern of my choosing (I went with ‘YYYY-MM-DD-HH-MM-SS’). This worked for the vast majority of my images. However, a few files seemed to have missing or incorrect EXIF data (don’t ask me how). On closer inspection I found that the original modification dates showed the correct date for when the image was taken, but I didn’t have an easy way of using this to rename the file. After some lengthy searching, I came across NameMangler (https://manytricks.com/namemangler/). This utility gave me the option to use the modification date as a parameter for renaming files (it’s limited to renaming five files at a time when unlicensed, but for my needs this was ideal, as only a handful of files needed adjustment).
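
If you’re comfortable on the command line, the same two-step rename can also be done with exiftool rather than the GUI tools above. This isn’t what I used; it’s just a sketch, and the directory path is a placeholder:

# Rename using the EXIF 'date taken' field, e.g. 2002-08-14-16-05-32.jpg
exiftool '-FileName<DateTimeOriginal' -d '%Y-%m-%d-%H-%M-%S%%-c.%%e' /path/to/dir

# For files with no EXIF date, fall back to the file modification date
exiftool '-FileName<FileModifyDate' -d '%Y-%m-%d-%H-%M-%S%%-c.%%e' -if 'not $DateTimeOriginal' /path/to/dir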

Now I’m the proud owner of 47,235 images, totalling 136 GB; and I promise to keep it up to date!

W3TC and .htaccess path manipulation

I had an issue recently with some shared hosting and W3 Total Cache: the root path wasn’t being declared correctly by the plugin’s default configuration. I came up with a slight modification to the /inc/define.php file (DISCLAIMER: this is a hack; if you use this your warranty is void and bad things may happen):

Original code:

function w3_get_home_root() {
    if (w3_is_network()) {
        $path = w3_get_base_path();
    } else {
        $path = w3_get_home_path();
    }
    $home_root = w3_get_document_root() . $path;
    $home_root = realpath($home_root);
    $home_root = w3_path($home_root);
    return $home_root;
}

Modified code:

function w3_get_home_root() {
    if (w3_is_network()) {
        $path = w3_get_base_path();
    } else {
        $path = w3_get_home_path();
    }
    //$home_root = w3_get_document_root() . $path;
    //$home_root = realpath($home_root);
    //$home_root = w3_path($home_root);
    $home_root = '/var/sites/YOURSITE/public_html';
    return $home_root;
}

As you can see, it basically involves hard-coding the path to your root directory. Nine times out of ten this won’t affect you, but on some shared hosting the problem can occur (you’ll get errors in the W3TC control panel about root paths not being set).

Enabling read/write on external drive shared between OS X and Linux (Ubuntu)

Okay, so I’ve had a 1TB external USB drive for a while which has been formatted for OS X use (HFS+). I’ve been doing a lot of work on Linux (Ubuntu 14.04) recently and wanted an easy, fast way to read and write to it from both operating systems. I dabbled with this briefly a while back without success, but recently I found a little trick to enable read/write on both OSes! This may seem like basic setup to some, but judging from the number of forum posts I’ve read on this, it does seem to fox a lot of people…

NOTE: there is a caveat to this: I don’t share this drive with any other computers. If you move a drive between machines, this solution isn’t for you. Read on to find out how to enable it!
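
To give a rough idea of the moving parts involved (this is a sketch of one common approach, not necessarily the exact trick described in the full write-up; the volume name, device node and mount point are placeholders), the usual blockers are HFS+ journaling on the Mac side and the read-only hfsplus driver on the Linux side:

# On OS X: turn off journaling so Linux can write to the volume
diskutil disableJournal /Volumes/YourDrive

# On Ubuntu: install the HFS+ tools and mount the partition read/write
sudo apt-get install hfsprogs
sudo mount -t hfsplus -o force,rw /dev/sdb1 /media/yourdrive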


Adalight, Arduino, Boblight, Ubuntu and XBMC

After seeing my friend build his own Ambilight-esque LED backlight for his TV, and having him kindly give me a spare set of LEDs to construct my own (he bought too many sets), I’ve gotta say…

IT’S AWESOME:

If you’re looking to build your own, it’s not immensely straightforward, but if you know your way around Arduino and the Ubuntu terminal, then this guide will help.

Automount software RAID array in Ubuntu

I’ve spent ages trying to get automounting of a RAID array to work, and after many failed attempts I now have it working, so I figured now would be a good time to document the process.

I’ll assume you have a working software RAID array (or a standard RAID array) which you can mount manually.

Step 1 – Get the correct UUID

A lot of the guides on this tell you to find the UUID of the RAID array by using ‘mdadm --detail --scan’. This is WRONG! It took me a while to discover that the correct UUID for your array can be found with the blkid command (blkid is a command-line utility to locate/print block device attributes):

blkid /dev/md127

Once you have the correct UUID you can move on…
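
To give an idea of where that UUID ends up, a typical destination is an /etc/fstab entry along these lines; the mount point, filesystem and options here are placeholders for whatever your own setup uses:

# Example /etc/fstab entry: replace the UUID with the one blkid reported
UUID=a1b2c3d4-e5f6-a1b2-c3d4-e5f6a1b2c3d4  /mnt/raid  ext4  defaults,nofail  0  2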

ReSrc.it – Responsive images done right

The image you’re about to see is not residing on my server; it’s being served by ReSrc.it, a new cloud-based service which actively delivers the optimum-sized image for your device. Don’t believe me? Resize the browser window (so that the image size is reduced), reload the page and check out the source. You should get a new image every time the window is resized!

[Image: Sunset in Norfolk - ReSrced!]

This is a massive leap forward for responsive design; it means the right image, at the right resolution, can be delivered to your device at the right time. ISPs and network providers are going to love this (as it saves on bandwidth), and web designers, photographers, content producers and anyone else who works with images online have been crying out for this sort of service since the whole ‘responsive web’ thing started.

I encourage you to check out their demo page (they have a load of image effects and switches to alter the loaded image), and if you think it’s awesomesauce then register for the beta programme!

Working with Twitter Bootstrap on OS X

After messing with some of Twitter’s Bootstrap files locally, I decided it was time to get a build environment established on my Mac. It quickly became apparent that there wasn’t a whole lot of guidance on setting up the build environment in OS X, so I’ve posted my efforts here so that anyone in my shoes can follow these steps:

Step 1: Install Node (and Node Package Manager)
Visit http://nodejs.org/ and download the installer for OS X (Lazylink: http://nodejs.org/dist/v0.8.12/node-v0.8.12.pkg)

Step 2: Clone and build Less.js
Install Less.js via the following command (assumes the Git command-line tools are installed):
git clone git://github.com/cloudhead/less.js.git

Next, go into the less.js directory and type ‘make’.

Then copy the less.js directory to /usr/local/less.js.

Add ‘export PATH=/usr/local/less.js/bin:$PATH’ to ~/.bash_profile:

Open a terminal and type:

touch ~/.bash_profile; open ~/.bash_profile

Paste the following in:

export PATH=/usr/local/less.js/bin:$PATH

and save…
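
Putting the whole of Step 2 together as one terminal session (the copy destination and PATH line match the steps above; sudo is assumed for writing to /usr/local):

# Clone and build Less.js
git clone git://github.com/cloudhead/less.js.git
cd less.js && make && cd ..
# Copy the built directory into /usr/local and add its bin directory to your PATH
sudo cp -R less.js /usr/local/less.js
echo 'export PATH=/usr/local/less.js/bin:$PATH' >> ~/.bash_profile
source ~/.bash_profile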

Step 3: Install Bootstrap Dependencies
Twitter Bootstrap depends on a number of packages; you can install all of them by using the following command with NPM:

npm install uglify-js less jshint recess -g

Step 4: Clone and make Twitter Bootstrap
Clone Bootstrap:
git clone git://github.com/twitter/bootstrap.git

Go to the bootstrap directory and type ‘make’.
The built files are output to the ‘docs’ directory…
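
Or, as a single run-through (the ls at the end is just to confirm where the output landed):

git clone git://github.com/twitter/bootstrap.git
cd bootstrap
make
# The compiled CSS/JS should now be under docs/
ls docs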

Enjoy!

Pragmatic Responsive Web Design

Following on from my last post, I’ve been doing some spelunking into ‘responsive design’, the latest buzz term circulating the web design and development camp. I’ve stumbled across a rather insightful SlideShare presentation from Brian and Stephanie Rieger, given at the Breaking Development conference, covering their work on browser.nokia.com. For that project they devised a new way to combine client-side information with device detection. It’s a really interesting approach and certainly worth assimilating!