Archive for May, 2009

Opening Winmail.dat files on a Mac

Sunday, May 31st, 2009

One fairly persistent issue I have is that some people I know insist on sending me emails from Windows mail clients (apparently Outlook is the worst offender) with the ‘Use Windows mail format’ option checked. This anti-social behaviour results in non-Windows users receiving attachments packaged up into a file called winmail.dat, which they can’t open. Apparently this is just a wrapper around standard attachments that could be handled OK, but MS prefer to use their own ‘standards’ instead. Sometimes you can ask the sender to resend the message in a standard format, but this is not always possible. I have done a little investigation and have discovered a few ways to deal with the problem.

Firstly, the OS X Mail app has an option to try viewing messages as alternate types: it is under View | Message | Next Alternative. This often manages to decode an attachment into plain text. If this fails, there is a free application called Enough that you can drop the winmail.dat file onto, and it seems to do a pretty good job of extracting the attachments from it. And finally there is a plug-in for Mail called Letter Opener, which is expensive but handles the conversion inside Mail and makes the attachments appear where they should be in the message. I’m guessing it’s expensive because it does a bunch of other things too, calendar conversion for one. It looks like it would be fairly easy to implement the attachment processing functionality on its own, so I’m adding it to my list of things to do on a quiet evening sometime.

Apple Blocking Access to the AppStore?

Wednesday, May 27th, 2009

I got a chance to investigate the strange access problems I found last week when trying to scrape the iPhone AppStore. It looks to me like something has definitely changed on the server, but it’s hard to see what. My original script used curl with its default user agent string, and that seemed to suffer timeout problems on every page. So I changed the agent string to the Firefox one, which seemed to result in an immediate improvement, but it too started to suffer slowdowns as it progressed through the store. Finally I changed all timeouts to 5 minutes, and it looks like every call then returned successfully. So all I can think of is that requests that aren’t from the iPhone or iTunes are being served, but they’re being sent to the back of the queue. I don’t see much logic in all this, but Apple do move in mysterious ways sometimes, and it’s always possible it’s a quirk rather than a policy decision. The good news is that access is still being permitted at some level, and we can go on slicing and dicing AppStore content into something useful.
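For anyone wanting to try the same two tweaks, here is a minimal sketch in Python rather than the curl script described above. The agent string is only a placeholder (the exact value I used isn't given here), and the names build_request and fetch are made up for illustration.

```python
import urllib.request

# Assumption: any browser-like string will do; this is a placeholder,
# not the exact Firefox string used in the original script.
BROWSER_USER_AGENT = "Mozilla/5.0 (Macintosh; rv:1.9) Gecko Firefox/3.0"

# The 5-minute timeout mentioned above, in seconds.
TIMEOUT_SECONDS = 5 * 60


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Return a Request that sends a browser-style User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=TIMEOUT_SECONDS):
    """Fetch url with the custom agent string and return the body as bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetch() on a store URL would then wait up to 5 minutes for a reply before giving up; whether the server still behaves as described is another matter.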

$100 free credit at go-grid

Saturday, May 23rd, 2009

I’ve been meaning to post this for about a week now, but I’m sure the information is still accurate. GoGrid have a $50 coupon on their site that’s easy to find and works out at about a month’s free basic service. I got an email from them saying that to get a $100 credit you just need to call 877.946.4742 or +1 415.869.7444 and ask for one. I guess they think the extra details they can grab from you in person are worth the extra cash. If they ask where you got the number, say you were forwarded the StartupSF email.

Apple blocking curl from the Appstore?

Tuesday, May 19th, 2009

Not quite sure what’s going on with the AppStore. I just resumed my experiments and it appears that a couple of things have changed. Firstly, calls from curl seem to be blocked, although changing the user agent seems to get round that. Why they would impose such a trivially bypassed hurdle is a bit of a mystery. Surely if the block has a specific target there are better ways to keep them out, like IP address blocking. It is interesting that they aren’t moving to impose a total block on non-iTunes clients though; that looks like a tacit admission that they are allowing store scraping at some level. More seriously, some of the browse URLs I was using previously don’t appear to work any more. I’m sure I can figure out what’s going on, but I’m going to need more time than I have now to investigate. I’ll post back as soon as I figure it out.


iTunes AppStore scraping – decoding the browse URL

Monday, May 11th, 2009

Further to my recent posts covering scraping the iTunes AppStore, I have made some progress towards decoding the browse URL that returns the list of apps by category. There is a slight wrinkle with categories that have sub-categories (currently only games) and a potential work-around to the 3500-per-page limit.

The browse URL breaks down to this:

http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse?path=/category/subcategory/page

The top level browse URL, ie

http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse

on its own gives a list of top level categories and their associated ids, e.g. TV shows is 32, Music videos is 31, Music is 34 and AppStore is 36.

So to browse a category from the root, you append the query string path=/id to the URL, i.e. the AppStore URL is

http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse?path=/36

which returns a list of AppStore categories and their ids – Weather = 6001, Travel = 6003, Games = 6014, etc.

Then, to browse all weather apps the URL is

http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse?path=/36/6001/1

where the final 1 seems to be a paging control, so where there are more than 3500 apps you can increment the last number to retrieve the next set of app details.

Where there are subcategories, they can be accessed by replacing the top level id with that of the category – so to browse all games subcategories the URL is

http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse?path=/6014

which returns the names and ids of the games subcategories (Action = 7001, Adventure = 7002, and so on). Then to browse the action games the URL becomes

http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse?path=/6014/7001/1
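The URL patterns above can be put together with a small helper. This is just a sketch of the scheme as I have decoded it so far; the function name browse_url and its page parameter are my own invention, not anything Apple defines.

```python
BROWSE_BASE = "http://ax.itunes.apple.com/WebObjects/MZStore.woa/wa/browse"


def browse_url(*ids, page=None):
    """Build a browse URL from a sequence of ids and an optional page number.

    With no ids the bare top level URL is returned. The page number, when
    given, is appended as the last path element.
    """
    if not ids:
        return BROWSE_BASE
    parts = [str(i) for i in ids]
    if page is not None:
        parts.append(str(page))
    return BROWSE_BASE + "?path=/" + "/".join(parts)


print(browse_url())                      # top level categories
print(browse_url(36))                    # AppStore categories
print(browse_url(36, 6001, page=1))      # first page of weather apps
print(browse_url(6014, 7001, page=1))    # first page of action games
```

Incrementing page for the same ids should give the next set of results where a category has more apps than fit on one page.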

It looks to me that currently, if the tree is traversed from the root until the list of subcategories comes back empty, and the leaf node is then used to retrieve the apps, there is no need for paging with a value greater than 1. This is also the only method I can see for determining which subcategory an app is listed under: the apps themselves link to the category and a genre but not a subcategory. I also don’t know right now if this will produce multiple instances of the same app, i.e. if an app can appear under multiple subcategories.
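The traversal described above could be sketched roughly like this. I haven't covered the response format here, so the lookup is passed in as a function: list_children is a hypothetical callable that takes an id and returns the ids listed under it (an empty list for a leaf). The small FAKE_TREE dictionary stands in for the real store and only contains ids mentioned in this post.

```python
def leaf_paths(node_id, list_children, parent_id=None):
    """Walk the tree depth-first and return (parent_id, leaf_id) pairs.

    A node is treated as a leaf when list_children returns an empty list
    for it. Each pair is what goes into the browse path, followed by the
    page number, e.g. (6014, 7001) -> path=/6014/7001/1.
    """
    children = list_children(node_id)
    if not children:
        return [(parent_id, node_id)]
    paths = []
    for child_id in children:
        paths.extend(leaf_paths(child_id, list_children, parent_id=node_id))
    return paths


# Stand-in data for illustration only: a few of the ids quoted above.
FAKE_TREE = {
    36: [6001, 6003, 6014],   # AppStore -> Weather, Travel, Games
    6014: [7001, 7002],       # Games -> Action, Adventure
}


def fake_list_children(node_id):
    """Hypothetical lookup; a real one would fetch and parse path=/<node_id>."""
    return FAKE_TREE.get(node_id, [])


print(leaf_paths(36, fake_list_children))
# [(36, 6001), (36, 6003), (6014, 7001), (6014, 7002)]
```

Collecting the app ids returned for each leaf into a dictionary keyed by app id would be one way to find out whether the same app turns up under more than one subcategory.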
