Tuesday, January 27, 2009

Review: iPhoto '09

Every time I think about abandoning iPhoto for something else, a new version comes out with something compelling enough to keep me using it that much longer.  iPhoto '09 boasts many new features, particularly Faces and Places.  Faces is a facial recognition tool and Places is for geotagging.  Both of these tools are fascinating and compelling, but not without flaws.

The Faces feature really has two parts.  The first part is locating human faces within a photo.  This works fairly well, but it will miss many faces.  Low-resolution photos and distance photos are problems simply because this tool requires a certain number of pixels to recognize that a space on the photo is actually a face.  It also fails more often as a person's face points away from the camera.  The second part of Faces is actual facial recognition.  Once the faces are located on images, you can go in and name the faces, and it builds a neural network for each person so it will recognize that person in other photos.  To work well, however, it needs a lot of training, especially if you have a lot of photos and you're not a professional photographer who only takes great photos.  Training involves a few approaches.  You can see all the photos Faces thinks are of a particular person, and you confirm or reject the recognition.  You can also go into your photos and name the people on the fly.  Either way, it takes time.

One question I have is when you manually define a face, and then assign a person, does it add that info into the neural network?  I hope not.  If Faces can't see the face, even though I assign it, I don't want it messing up the network.

Places allows you to set the location of your photos and to sort through your photos by location.  Assigning locations to photos is fairly easy.  If you don't care about precision, you can set the location by place name.  If you do care, you can use an address or use a Google Map to place the image.  If you want to set the latitude and longitude directly, forget it.  This is the major WTF issue with iPhoto '09.  There might be a way to set coordinates directly, but I haven't found it.  I was worried that this was true during the Macworld keynote, and I'm disappointed.  I didn't expect them to deal with track logs or those sorts of things, but I am not going to try to click/drag through Google Maps when I have the exact location.  I need to find a solution that's compatible with iPhoto:  Photoshop/Bridge won't do, sadly.  What's with applications that allow you to set all sorts of metadata EXCEPT location?

An important consideration for these new features is whether they export the new metadata.  Places does, in fact, add GPS and place name information to exported images if you so choose.  Faces, on the other hand, does not export any metadata that I've noticed.  This is a bit disappointing, too.  However, it is possible to then tag all found images with the person's name and that would be added to the metadata.

It also appears that place names and Faces names are searchable in the media browser in iWeb (and thus in all apps using the browser).  I searched for place names I know I added, but did not use a keyword, and they came up in the browser.  The same is true for people I know I added to iPhoto.  I did not see a way, however, to do a spatial search by latitude and longitude.  You would have to use a city name, state, country, etc. if you want different spatial ranges.

With the exception of placing exact coordinates in an image, iPhoto '09 has a lot of potential power.  However, neither of these new features is "free" in the sense that you don't have to do anything to take advantage of it.  It's work, particularly if you have thousands of photos to work through.  We'll see if the effort pays off.


Monday, January 26, 2009

Revamping my data protection plan

Revamping might not be the right word, since I don't have a written plan, but I'm at least re-evaluating what I do to protect my data.  

In the last week, I've been more seriously considering "cloud-based" data storage; that is, storing data on someone else's server out there on the "internets".  The advantage of this is that if my computer is stolen, the house burns down, or a tornado hits (there have been 2 near misses over the last few years), data in the cloud would be preserved.  Thus, it can be an effective off-site storage solution.

The other off-site storage solution I already use: put your data on some media and physically store the media off site.   This is a great solution, but if something goes wrong, it might take a while to access the media to recover the data.  "Cloud" storage, on the other hand, offers potentially instantaneous access to your data.

A problem with cloud storage (aside from cost considerations) is that you are dependent on the hosting company to maintain security and solvency (i.e. you don't want them to go under).  Another problem is you are limited by bandwidth, either your own connections or the bandwidth allowed by the hosting company.  This limits the practicality of "cloud" storage for some solutions.  For a small  set of documents, such limitations are minor.  For many gigabytes of storage, this becomes a problem.

An attractive service is Amazon's S3.  It's reasonably priced, but the costs are not consistent from month to month.  If I put 4 GB of data on the site for a year, and never access it after that, it would cost a minimum of $7.60 per month (or about $91 per year).  Mozy, on the other hand, is cheaper and unlimited at the pro level, costing about $60 per year.  However, it takes more of a traditional backup approach rather than just file storage (I say this without testing, however).

I'm planning to try Mozy's free membership, but it will have to be used as a backup solution for a limited amount of my data.  My climate model results will have to stay on a physical-media, off-site storage plan, since the cloud is out of reach at the moment.

Monday, January 19, 2009

Martin Luther King Day

Today is the day when Americans recognize the life of Martin Luther King.  The man had a profound impact on this nation, and those impacts will echo throughout the future of America.  

As a white guy from a small, almost all-white town in northern Ohio, who was less than 2 years old when MLK was assassinated, it took me a while to even begin to appreciate what MLK and like-minded people did for this nation.  I doubt I can ever fully appreciate it, but as I look out into the world and see ethnic and religious conflict, I can see what America could have become.

A few years ago, the death of MLK touched me in a surprising way.  While MLK was doing his civil rights work, America was at war.  And when MLK died, my father was in Vietnam.  Back then, there was no internet, cell phones, blogs, or any of those things.  Along with letters, soldiers and their families would communicate using reel-to-reel magnetic tape.  If you've seen the movie Apocalypse Now, one of the men on the boat was playing a tape like that on the river.  Sadly, these were often taped over or burned for security reasons.  However, my parents saved a few.  I spent some time converting these tapes into a digital form to preserve them before they fell apart.  Most of the conversations that we have left are only interesting to my family.  But to my surprise, my father very briefly mentioned the death of "Doctor King".  He didn't say much about it, but it was there.  In terms of MLK history, I'm sure this story is nothing much.  I'm sure everyone had thoughts on MLK's death.  However, its impact on me was fairly strong because it was my father, my father's voice, and recorded during a war in a faraway place.


Tuesday, January 06, 2009

Is Garbage Collection Needed For NSOperationQueue?


I ran into a brick wall over the last few days.  I've been processing my climate model images using two NSOperation subclasses and NSOperationQueue in Mac OS X Leopard.  In my case, the wall was the fact that the queue simply refused to release my operations until the queue was empty.  If I only had a few operations, this would be okay.  However, I generate about 2000-2200 operations in a fairly short amount of time, and they should take about an hour or so total.  When I tried to submit them all, the system was completely bogged down by the end of the run and would crash.  It was so bad, I'd have to restart if I wanted to get back to work.

If you're familiar with these objects, you'd think this is because I didn't release the objects after submitting them to the queue.  I triple-checked; everything was properly released.  I stuck an NSLog statement in the -(void)dealloc method to see when it was called.  Absolutely none of my operations were dealloced until the job queue was empty.  There are dependencies between some operations, but eliminating those did not solve the problem and only added complications.

I tried a number of approaches to get the queues to empty.  I created multiple queues, each one to be destroyed after running a number of jobs, but this setup failed.  I created a queue manager that would feed a single queue after all jobs were submitted - failed.  Frustration!

I then recognized that the problem was likely an autorelease issue: when the queue removes a completed operation from its internal array, the object isn't deleted right away.  Instead, it hangs around for a while until an autorelease pool releases it.  In my own code, this problem is easily solved by wrapping the code in a pool, between
NSAutoreleasePool *aPool = [[NSAutoreleasePool alloc] init];
   and  
[aPool release];
That usually solves the problem.
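
A minimal sketch of that pattern in a loop, assuming manual retain/release (no garbage collection).  The function name ProcessFrames and the string work inside the loop are hypothetical stand-ins for real per-image work, not my actual code; the point is only that each pass gets its own pool, so temporaries are freed per pass instead of piling up until the end.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for per-image work; returns how many items were handled.
// Assumes manual retain/release (no ARC, no garbage collection).
NSUInteger ProcessFrames(NSUInteger frameCount)
{
    NSUInteger handled = 0;
    NSUInteger i;
    for (i = 0; i < frameCount; i++) {
        // One pool per pass: autoreleased temporaries die when the pass ends.
        NSAutoreleasePool *aPool = [[NSAutoreleasePool alloc] init];

        // stringWithFormat: returns an autoreleased object.
        NSString *name = [NSString stringWithFormat:@"frame-%lu.png",
                                                    (unsigned long)i];
        if ([name length] > 0) {
            handled++;
        }

        [aPool release];   // frees `name` and any other temporaries from this pass
    }
    return handled;
}
```

Calling ProcessFrames(2000) walks 2000 passes without the autoreleased strings accumulating, which is the behavior I wanted from the queue but could not get.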

In this case, however, the queue is an opaque library class, so I can't get in there and add a pool!  In desperation, I started looking up the relatively new garbage collection in Objective-C 2.0.  My hope was that garbage collection could be my way of injecting an autorelease pool action into the queue.

My first try was a failure, although some objects were released.  I needed a mechanism to trigger collection.   As it turned out, there was such a mechanism: 
[collector collectIfNeeded];
 Using KVO, I called this method whenever the number of operations held by the queue changed.  This worked!  I watched as the jobs ran and finalized!
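
Roughly, the wiring looks like the sketch below.  The QueueWatcher class and its method names are hypothetical, made up for illustration.  It assumes the queue's "operations" key is observable through KVO, and that the program is built with garbage collection enabled on an OS X release that supports it; without GC, defaultCollector returns nil and the collect call does nothing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (illustration only): watches a queue's "operations" key
// and asks the garbage collector to run whenever that list changes.
@interface QueueWatcher : NSObject
{
    NSUInteger changeCount;   // number of KVO notifications received so far
}
- (void)watchQueue:(NSOperationQueue *)queue;
- (void)stopWatchingQueue:(NSOperationQueue *)queue;
- (NSUInteger)changeCount;
@end

@implementation QueueWatcher

- (void)watchQueue:(NSOperationQueue *)queue
{
    [queue addObserver:self forKeyPath:@"operations" options:0 context:NULL];
}

- (void)stopWatchingQueue:(NSOperationQueue *)queue
{
    [queue removeObserver:self forKeyPath:@"operations"];
}

- (NSUInteger)changeCount
{
    @synchronized (self) { return changeCount; }
}

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context
{
    if ([keyPath isEqualToString:@"operations"]) {
        // Notifications can arrive on the queue's worker threads.
        @synchronized (self) { changeCount++; }
        // Assumption: GC is enabled; otherwise defaultCollector is nil (no-op).
        [[NSGarbageCollector defaultCollector] collectIfNeeded];
    } else {
        [super observeValueForKeyPath:keyPath ofObject:object
                               change:change context:context];
    }
}

@end
```

Call watchQueue: before submitting operations and stopWatchingQueue: before releasing the queue, so the observer is never left registered on a dead object.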

As it turned out, however, if an object is a dependency for another object, it is still not released after it runs, only after the dependent object is released, which makes sense.  However, what this means is that a large number of objects still lingered far longer than they should.  It appears that the queue takes a first come, first served attitude that skips an operation if it isn't ready and doesn't come back to that operation until every other operation gets a chance.  The solution for this (which failed without garbage collection) is a second queue that handles all the operations that are dependent on other operations.  Thus, the second queue ends up running these operations far sooner, and both objects are released more quickly.
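
Here is a minimal sketch of the two-queue wiring.  StepOperation, RunPairsOnTwoQueues, and the "render"/"convert" labels are hypothetical names for illustration, not my real classes.  The sketch uses manual retain/release so it stays self-contained; it only shows how the dependent operations go onto their own queue, and it does not by itself reproduce the memory behavior, which in my case depended on garbage collection.

```objc
#import <Foundation/Foundation.h>

// Hypothetical operation that only records that it ran.
@interface StepOperation : NSOperation
{
    NSString *label;
    NSMutableArray *log;   // shared log, guarded with @synchronized
}
- (id)initWithLabel:(NSString *)aLabel log:(NSMutableArray *)aLog;
@end

@implementation StepOperation
- (id)initWithLabel:(NSString *)aLabel log:(NSMutableArray *)aLog
{
    if ((self = [super init])) {
        label = [aLabel copy];
        log = [aLog retain];
    }
    return self;
}
- (void)main
{
    @synchronized (log) { [log addObject:label]; }
}
- (void)dealloc
{
    [label release];
    [log release];
    [super dealloc];
}
@end

// Independent operations go on one queue; the operations that depend on
// them go on a second queue.  Returns the labels in the order they ran.
NSArray *RunPairsOnTwoQueues(NSUInteger pairCount)
{
    NSMutableArray *log = [NSMutableArray array];
    NSOperationQueue *independentQueue = [[NSOperationQueue alloc] init];
    NSOperationQueue *dependentQueue = [[NSOperationQueue alloc] init];
    NSUInteger i;
    for (i = 0; i < pairCount; i++) {
        NSString *first = [NSString stringWithFormat:@"render-%lu", (unsigned long)i];
        NSString *second = [NSString stringWithFormat:@"convert-%lu", (unsigned long)i];
        StepOperation *render = [[StepOperation alloc] initWithLabel:first log:log];
        StepOperation *convert = [[StepOperation alloc] initWithLabel:second log:log];

        // Set the dependency before either operation is queued.
        [convert addDependency:render];
        [independentQueue addOperation:render];
        [dependentQueue addOperation:convert];

        [render release];    // the queues now own the operations
        [convert release];
    }
    [independentQueue waitUntilAllOperationsAreFinished];
    [dependentQueue waitUntilAllOperationsAreFinished];
    [independentQueue release];
    [dependentQueue release];
    return log;
}
```

Dependencies work across queues, so each dependent operation still waits for its own prerequisite; it just no longer waits in line behind hundreds of unrelated operations.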


Finally, I'm ready to move on to the next problem!