Monday, January 31, 2005

More on CAPTCHA

OK, so I stirred up a few bees in the last post :)

 

I would have VERY much preferred it if CAPTCHA (as implemented in the major blogging engines) were more solid. But it *can't* be. Blogs are made to be open and accessible. That's the whole point, and the ability to post comments is a big part of it. Like you say, blogs without comments enabled are highly inconvenient... they are nothing more than traditional "programming": one-way communication. So we *want* them to be accessible. Obviously, CAPTCHA throws a speedbump into that; the whole idea of CAPTCHA is to be a speedbump.

And that would be OK *if* it actually WERE a speedbump. As soon as I saw the implementation (always 6 letters, easy contrast, fixed character set - actually it's plainly just 3 hex values, background always a fixed pattern) I knew instinctively that it would be incredibly easy to circumvent. I don't even think you need an AI engine to do it... I can think of a few algorithms that have zero Neural Network components that could achieve a 50% success rate. My point is this is just a very brief escalation in an arms race, which buys us very little time but throws up a significant deterrent to valid comment posters.

I would also point out that, as far as I can tell, the Web Service comment APIs do not even support it. And these are the most visible and obvious spammer attack vector. After all, spammers are using automated tools... it's easier to just call a web service than to scrape a web page exchange. And of all the blogs I read, only two that I can think of right now (Shawn's and Daily WTF) have a proprietary comment submission API (or no WS API) that is not exposed here.

This guy's work is not going to give spammers any kind of leg-up. You can bet they were well on the case long before... it's just too enticing for them. They are fully willing to send out email spam just to get the one in 10,000 who will click the links... blogs are a gold mine compared to that... even the comment links. As far as I know, they haven't breached it yet (en masse), but it's only a matter of time.

The reason I say I think this guy did a good thing here is that he is making it very clear to his peers (he only spammed MVPs) that they do not have as much security as they think they do with this. It's easy to get mad at him for "opening Pandora's box", but he didn't really open it so much as he said "hey, everyone open your eyes, the box is already open!".

I would personally like to see a real solution to the spam problem (I hate it as much as anyone else). But this wasn't it. This was too easy to circumvent in an automatable way... and once a spambot is made for ONE dasBlog site, it works for ALL of them, same goes for the other platforms.

The only way to give yourself real relief is to implement your own CAPTCHA (or other confirmation process) so that you won't become vulnerable by virtue of your platform. It's one situation where "roll your own" security may actually be better, since we don't have a "standard" that is solid enough. That's too much work for me though (maybe for others it is not); I just don't have the time for it. It's less hassle to just delete the garbage as it comes in at this point.

Monday, January 31, 2005 9:27:44 AM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [3] | 

 Sunday, January 30, 2005

That didn't take long: CAPTCHA is irrelevant

As soon as I heard dasBlog and other blog engines had implemented CAPTCHA tests on the comment forms, I knew it was only a matter of time before it was shown to be irrelevant. But Casey Chesnut has defeated it even quicker than I thought would happen.

He simply spent a few minutes studying the implementation, figuring out its limits and patterns, then wrote a bit of AI to decipher the images. His routine only has a success rate of 50%, but that's more than sufficient to perform comment spam. To prove his point, he ran it against over 90 blogs on msmvps.com.

This is the precise reason I never even considered enabling this feature of dasBlog. It is easily circumventable, and really does nothing more than interfere with your real readers' ability to contribute.

This emperor has no clothes.

Hats off to you Casey, this false security needed to be called out.

 

Sunday, January 30, 2005 3:00:15 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [3] | 

 Saturday, January 29, 2005

OK RSS Bandit is starting to grate on my nerves

It's just not threaded properly. It can't be.

I noticed long ago that it uses worker threads to update feeds, and often when a feed fails to update, that worker thread (most likely a pool thread) becomes a zombie. Over time, this further degrades performance until the pool just runs out of available threads (not to mention the memory consumption is just insane by then). The only remedy is to shut down the program completely and restart it... giving you a day or two before having to kill it again.

And now this - I have noticed what can only be described as "anomalies" in my downloaded feeds before... things like seeing comments attached to the wrong posts, and so on. I always assumed this was just due to bugs in the RSS feed producer I was subscribed to. But today, on a pair of blogs, both had a post with a title of "{...}". Each had a jumble of intermixed text from recent posts in the other blog. In order to get the feeds sorted out properly, I had to delete them from my opml, restart RSS Bandit, and add them back. And of course, there are no posts on either blog titled "{...}".

I suppose the corrupted posts could be due to buggy local storage code, but it sure looks a whole lot like the updater threads are just poorly monitored and synchronized.

I guess it's time to try NewsGator. I am not thrilled about packing my RSS feeds into Outlook, but guess I will get used to it.

At least they have a free trial period though. I hate paying for something and tossing it out shortly after.

 

Saturday, January 29, 2005 4:48:45 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [6] | 

 Friday, January 28, 2005

Add three more WS acronyms to your vocabulary

Three new specs were released as W3C recommendations today: XML-binary Optimized Packaging (XOP), SOAP Message Transmission Optimization Mechanism (MTOM), and Resource Representation SOAP Header Block (RRSHB).

They are all interrelated... XOP is the XML mechanism that is the building block for MTOM (SOAP implementation of XOP which does not dictate wire protocol binding), and RRSHB is an expansion of the SOAP HTTP binding to support MTOM/XOP.

In a nutshell, all of this stuff has one essential purpose: freeing you from having to encode binary data into base64 when you serialize a SOAP message. Basically, base64 is a heavyweight encoding scheme that adds a TON of fat: every 3 bytes of binary data become 4 characters of text, so the payload grows by roughly a third. By allowing binary octet data to be transmitted unencoded (and, under certain circumstances, relayed across SOAP processors unencoded) instead of taking that encoding hit, the bandwidth requirements of moving large messages around are reduced significantly. Not only that, but the CPU overhead of doing the base64 encoding/decoding is also eliminated. And yet another, though smaller, benefit is that fewer buffer resources are required on both ends of the pipe (assuming both ends can deal with the data in its raw format). Of course, some operations may require access to the data in base64, so a SOAP sender or receiver may have to encode/decode it anyway (I think XML-Signature requires binary data to be addressable in base64?)... but it still gets the bandwidth reduction when it is transmitted over the wire.
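To put a number on that bloat, here is a minimal sketch in C#. The class and method names (Base64Overhead, EncodedLength, MeasuredRatio) are made up for illustration; the only framework call it leans on is Convert.ToBase64String. It computes the size a payload grows to when base64-encoded, and measures the ratio on a real buffer.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class Base64Overhead
{
    // Number of base64 characters needed for rawByteCount bytes:
    // every group of 3 bytes (the last group padded) becomes 4 characters.
    public static long EncodedLength(long rawByteCount)
    {
        return ((rawByteCount + 2) / 3) * 4;
    }

    // Encodes a buffer of pseudo-random bytes and reports
    // encoded size divided by raw size. Expects rawByteCount > 0.
    public static double MeasuredRatio(int rawByteCount)
    {
        byte[] raw = new byte[rawByteCount];
        new Random(42).NextBytes(raw);
        string encoded = Convert.ToBase64String(raw);
        return (double)encoded.Length / raw.Length;
    }
}
```

For a 3,000,000 byte payload, EncodedLength returns 4,000,000: a full extra megabyte of text before any SOAP envelope overhead is counted. That third is what XOP/MTOM lets you skip on the wire.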

I can see a lot of benefit to this.

For example, this makes digital media transmission over web services a whole lot more appealing (as opposed to scraping together a proprietary protocol). SOAP-routed media could turn out to be quite interesting. I know folks have tried to do it with Web Services, and gave up in frustration due to the bandwidth hit... but now the hit isn't there... it's now basically raw media data wrapped in some dandy XML metadata... fascinating...

Another use I can see is just around the corner, when ADO.NET 2.0 datasets support REAL binary serialization instead of the "serialize to pseudo-xml, and call it 'binary'" serialization of ADO.NET 1.x. In a purist sense, we know it's bad form (argue it here, here, and here) to send datasets over web services ("Data on the Inside vs. Data on the Outside" and all that). But I tend to agree with the opinion that in most cases, once we commit to sending a dataset in a message, it becomes "opaque" like an image or any other binary data would be... and doesn't really break the purist view (come on, squint real hard with me). So passing a large dataset in binary format is perfectly acceptable to me, and these new specs will make it significantly more viable. One of the major debate points against datasets over web services has been the bloat it causes... but with binary serialization and XOP/MTOM, that argument fades fast.

Of course like anything else, I am sure someone somewhere will find a way to abuse the specs and I will be griping about it later.

 

Friday, January 28, 2005 2:09:25 AM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

 Thursday, January 27, 2005

New application blocks coming

The word this week is that on Friday there should be a set of new application blocks released on the MSDN patterns and practices site. We can expect a new Data Access and Logging block, as well as a Cryptography block, and a few other treats. Sounds like a pretty good pack of reusable blocks, so I for one will be keeping my eye on MSDN tomorrow.

 

[update:] The new blocks are now online at this url: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag2/html/entlib.asp

 

Thursday, January 27, 2005 3:05:53 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [3] | 

Volkswagen parody ad

Thursday, January 27, 2005 1:47:21 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

 Monday, January 24, 2005

"library not registered" when creating new c# project item or project

What a stupid and obscure bug in VS.NET 2003. Thankfully someone figured it out and posted the fix on their blog (which didn't take all that long to find in Google). Making a note of it here in case it happens again. Who would have ever thought a DLL from the Visual C++ directory tree in Visual Studio would get unregistered simply by running a VS add-in deployment package install/uninstall a few times... and then would keep C# project wizards from working??

At least the fix is very simple and quick:

regsvr32 "C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\vcpackages\csproj.dll"

Yes, that's cSproj.dll in the VC7 folder.

 

Monday, January 24, 2005 1:16:19 AM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [3] | 

 Thursday, January 20, 2005

Atlanta .NET Book Club

Brendon and Matt, the Atlanta .NET Regular Guys, are going to resume this monthly event. It was put on hold for the last two months, mainly due to the holidays. It's a pretty laid-back meeting, often with very few attendees. In fact, it would be a record if some more folks swung by and raised our attendance to, say, 5.

In any case, tonight is the meeting for January. At 6:30 PM, you can find us at the 5 Seasons Brewery (Roswell Rd, just inside 285, in the back of the "Prado" complex). I am told they actually have a free WiFi access point there, so if you get bored with the discussion then you can always just surf some p0rn.

Swing by and join us!

 

Thursday, January 20, 2005 1:48:23 AM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

 Wednesday, January 19, 2005

Cool amateur-processed Titan images

Nature.com is carrying a story of how amateur astronomers are actually delivering processed raw images before NASA/ESA. There are some really cool images coming out from these folks, including inferred topographical maps and stitched panoramas.

A good site with an index of the amateur images can be found at http://anthony.liekens.net/huygens_static.html

 

Wednesday, January 19, 2005 5:52:26 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

 Tuesday, January 18, 2005

dasBlog version 1.7

Thanks Michael for pointing out the new dasBlog release 1.7.

So far it appears to run much smoother and faster, and the referrer blacklist looks promising as well.

My favorite feature though has got to be the new "Internet Searches" section in the Referrer Log admin page. Now I can see (more easily) what kind of google searches are finding me, and therefore what post topics draw the most interest.

 

Tuesday, January 18, 2005 9:13:44 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

 Monday, January 17, 2005

Florida Code Camp

Just bookmarking this so I don't lose the URL again... Florida Code Camp

 

Monday, January 17, 2005 3:48:09 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

 Tuesday, January 11, 2005

More garbage collector woes

A couple of weeks ago, I recounted some trouble I was experiencing with the .NET garbage collector. My efforts to dispose and clean up objects, and eliminate unnecessary allocations (essentially, to be miserly with resources) brought a great deal of stability, performance, and peace of mind.

However, while they were much rarer, I still continued to experience random crashes in the application. This time, though, I was able to get Windbg to trap the TerminateProcess, allowing me to perform a post-mortem. Here is what I discovered:

In one screen of my application, there is a PictureBox. This screen is displayed while the application is performing lengthy operations, and the PictureBox contained an animated GIF. The purpose of this was to keep the user preoccupied while waiting (just like the waving flag at the top right of your browser while pulling down a page). Sounds pretty innocuous.

Well, as it turns out... behind the scenes, either the framework or Windows itself (I did not dig into the PictureBox MSIL to check) actually spins up a thread to pump the frame changes of the animation. Yes, this is a recipe for trouble: a rogue thread running in the system, whose obvious task is to update the GUI (it is pumping the animation). Hard application crashes are a well-known side effect of a background thread that does not synchronize all access to shared (UI) resources with the owning thread.

So what was happening (confirmed by the smattering of PictureBox calls and lots of native WinAPI calls on the thread that triggered the crash) is that the animation pump thread was not properly coordinating with the UI thread that actually owned the PictureBox. Or perhaps it was behaving well under normal circumstances, but it was not observing the Garbage Collector's request to pause while the GC does its thing. You see... the crashes all also happened to occur while the GC was performing a heap walk. While the GC thread is active, all other threads are *supposed* to be paused. But the animation thread was not. So if that animation thread happened to wake up to pump another frame while the GC was still walking the heap, the result was a very nasty crash.

Simply replacing the graphical content of the PictureBox with a static image (such as a JPG) eliminated the problem by virtue of the offending thread never being created in the first place (it is only created if you have an animated image loaded).

So my primary lesson from this is to simply not use an animated GIF in a PictureBox control. My secondary lesson is that it is still a very wise thing to explicitly clean up after yourself using Dispose() and Clear() on objects that support those semantics, and setting references to null for all non-value types as soon as they are no longer needed. Work *with* the garbage collector, not against it!
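For what it's worth, the "clean up after yourself" habit can be sketched like this. ScratchBuffer and CleanupDemo are hypothetical names invented for the illustration (this is not the code from the application above); the sketch only shows the shape of the pattern: Dispose() what is disposable, Clear() what is clearable, drop the references, and let a using block guarantee that it all happens.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical resource holder, used only to illustrate the cleanup pattern.
public sealed class ScratchBuffer : IDisposable
{
    private MemoryStream _bytes = new MemoryStream();
    private List<string> _notes = new List<string>();

    public bool IsDisposed { get { return _bytes == null; } }

    public void Add(string note, byte[] data)
    {
        if (IsDisposed) throw new ObjectDisposedException("ScratchBuffer");
        _notes.Add(note);
        _bytes.Write(data, 0, data.Length);
    }

    public long ByteCount
    {
        get
        {
            if (IsDisposed) throw new ObjectDisposedException("ScratchBuffer");
            return _bytes.Length;
        }
    }

    // Releases everything this object holds. Safe to call more than once.
    public void Dispose()
    {
        if (_bytes != null) { _bytes.Dispose(); _bytes = null; }
        if (_notes != null) { _notes.Clear(); _notes = null; }
    }
}

public static class CleanupDemo
{
    public static long FillAndRelease()
    {
        long written;
        using (ScratchBuffer buffer = new ScratchBuffer())
        {
            buffer.Add("header", new byte[] { 1, 2, 3 });
            buffer.Add("body", new byte[] { 4, 5 });
            written = buffer.ByteCount;
        } // Dispose() runs here, even if an exception was thrown inside the block
        return written;
    }
}
```

The using block is the part that matters most: the cleanup runs at a known point instead of whenever a collection happens to get around to it.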

 

Tuesday, January 11, 2005 5:30:52 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [3] | 

The new Mac Mini

Apple just announced this really slick looking machine... the Mac Mini. I love the form factor, it reminds me of the old SPARC stations we used in college.

I sure would love to see a similar form factor MCE machine, priced around the same ($499). Right now, the best we can do with Media Center is the HP z540/z545 or a “build your own” using something like a D.Vine case from Ahanix.

Tuesday, January 11, 2005 5:06:31 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

Interested in building your own Media Center Edition machine?

Perhaps, like me, you are contemplating building your own Media Center PC. Perhaps the HP set-top boxes are just too expensive. And the other products from Dell, Viewsonic, etc just look ugly or are too noisy. Well, Microsoft held a live Webcast session a couple days ago, covering the entire build process for putting together an OEM MCE system. Pretty much all you need to know about hardware selection, common issues, and configuration is covered. I tuned in, and found it to be quite informative to someone (like me) who has yet to attempt the building of one. And now, the recorded webcast has been made available for viewing. This link will take you to the registration page if you are interested (60 minute duration webcast).

Tuesday, January 11, 2005 3:10:56 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [1] | 

 Thursday, January 06, 2005

Portable Media Center competition

I just saw that there is already a Linux-based offering in the PMC space. A company called ARCHOS is releasing the Pocket Media Assistant PMA400 this month (product specification page here). The feature list is very impressive, and it even claims to be capable of dealing with DRM-locked Windows Media files. And for under $800, it's quite an interesting device... especially considering it supports PDA functionality (touch screen!), Wi-Fi, wired Ethernet, and can even record directly from cable/satellite.

This kind of competition is great, it will help push the envelope in the portable media and media center arenas.

 

Thursday, January 06, 2005 7:39:27 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [0] | 

Wanna see what Bill G has to say about Media Center PCs?

Bill Gates did the keynote for CES in Las Vegas, where he demo'd Windows Media Center Edition. Of course, like any Microsoft demo he ever does, it crashes at some point (I think they stage the crashes on purpose, to get media interest higher). The entire keynote is taped and can be viewed online at:

 

http://metahost.savvislive.com/microsoft/20050105/ms_ces_20050105_300.asx

 

***This link has been slashdotted, so the playback is a little rough right now. I recommend setting your media player to a 64k stream max with a sixty second network buffer before trying to view this...

Thursday, January 06, 2005 10:18:54 AM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [1] | 

 Tuesday, January 04, 2005

Old Media

A lot of lip service has been given to the “new media” lately, and I am as guilty of this as anyone. We like to talk about our MP3's, our DivX's, DVDs, media centers, iPods, blah blah blah....

But what about what is arguably the best media to be found? I am of course talking about the printed word: Books. They have been around far longer than any other communication device. And even in this era of eBooks, podcasts, and pay-per-view, they are still the absolutely best form of storytelling there is.

A perfect example: Rent “Starship Troopers” (~1997) and watch it with all of its special effects, cheesy lines, and ridiculous plots. If you make it through that (or at least far enough to follow my point), then go pick up Robert Heinlein's Starship Troopers (~1959) from a bookstore or your local library. There is no comparison. The original story is a masterpiece, especially considering that it was written in the 1950's. The movie is at best a mediocre mentally-void shoot-em-up special effects flick. I could list more examples (Lord of the Rings Trilogy) if I really wanted. Not in another hundred years will electronic story delivery be able to beat the tactile realness of a good printed book.

I had almost forgotten this, having been literally overwhelmed with all of the “new media technology” in the last few years. But today I happened to glance over at a bookshelf I had not paid attention to in four years. There, among the rows of books I had read, were six of them I had not. I had purchased and forgotten them. Such a waste (not of the money for the books - but of my time in the last four years). One book authored by Charles Sheffield, another by Robert L Forward, and the other four by my favorite author of all time, Greg Bear (I am so envious of Scoble).

I think it's also no coincidence that for 4 years I have been having trouble keeping focused and motivated. Reading is a very healthy activity for the mind, and I have been neglecting this for too long now. I have recently taken the steps to rejuvenate my health (diet and exercises), but I have been forgetting to take care of my mind.

So as my first resolution for the New Year, I am going to read these six books. The 10 or so DVDs I was given for the holidays will just sit unused while I take this time to dust off the creative side of my brain.

I still cannot believe I ever allowed one of my favorite intellectual pastimes (hard sci-fi reading) to fall away unnoticed like that...

 

Tuesday, January 04, 2005 7:29:29 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [2] | 

New phone devices coming from Samsung

Here is a link to some very interesting smartphone concept devices that were leaked from Samsung. I am especially intrigued by the third one:

Samsung "Thor"
Processor unknown, 3GB hard drive for memory, Bluetooth, GSM/GPRS, 18-bit Color 240x320 display, 4.45" x 1.89" x 0.83", and "MS Smartphone Magneto"

Yes, this is a phone. With Bluetooth and a 3GB drive. No idea what “Magneto” is, but my guess is it's a Smartphone version of Portable Media Center Edition. Wow, phone + camera + PDA + Bluetooth + streaming media. This is really approaching the “all-in-one” personal device. I also notice it lists “Stereo BT Streaming” as a feature, and also “OTA / Online Music Service”... I don't suppose these will accept arbitrary sources? This could lead to *realtime* mediacasting if so.

Very promising direction here...

 

Tuesday, January 04, 2005 3:16:08 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [1] | 

 Saturday, January 01, 2005

Tips and Tricks: String.Intern()

In my recent pilgrimage through the .NET memory model, the garbage collection process, and the fine art of improving performance through runtime profiling and heap inspection, I came across this little piece of knowledge that may one day (maybe today?) benefit you as well. This is a short discussion of the subtle and rarely mentioned static function:

String.Intern()

From the MSDN documentation, this method is explained as such:

“The Intern method uses the intern pool to search for a string equal to the value of str. If such a string exists, its reference in the intern pool is returned. If the string does not exist, a reference to str is added to the intern pool, then that reference is returned.”

Sounds innocuous, but this little core feature may end up having a great impact on your applications. Essentially, when you use String.Intern(), you get back a reference to a single shared copy of that string value, instead of holding on to yet another copy of the same data on the managed heap. When you embed literal string values (string myString = “some value”) in your program code, the compiler automatically does this for you. Embedded literal strings will be “interned” for you without any additional effort.

So you might be wondering, “OK so now why do I care about this, if the compiler already does this for me?”. The answer is that the compiler only handles hard-coded string values. Consider this snippet of code:

public string[] BurnBabyBurn(string Input)
{
    string[] MyHugeArrayOfJunk = new string[2000];
    for (int Index = 0; Index < 2000; Index++)
    {
        // Build a fresh string on every pass, the way values parsed from a
        // file or read from a database arrive at runtime. (Assigning Input
        // directly would only copy the reference, not the string data.)
        MyHugeArrayOfJunk[Index] = new string(Input.ToCharArray());
    }
    return MyHugeArrayOfJunk;
}
[...]
string[] test = BurnBabyBurn("0123456789");

As you have probably guessed, this will put 2000 separate copies of the same string value “0123456789” on the heap. Not very friendly to your memory footprint. So let us alter the previous code sample to make use of String.Intern():

public string[] BeNiceToMe(string Input)
{
    string[] MySmallerArrayOfJunk = new string[2000];
    for (int Index = 0; Index < 2000; Index++)
    {
        // The freshly built copy is swapped for the pooled instance and
        // becomes garbage right away.
        MySmallerArrayOfJunk[Index] = String.Intern(new string(Input.ToCharArray()));
    }
    return MySmallerArrayOfJunk;
}
[...]
string[] test = BeNiceToMe("0123456789");

The difference is significant. This second version of the same logic will now make 2000 references to the same string value, rather than 2000 copies of it. This is a huge difference in terms of memory allocation on the managed heap (don't believe me? Try both routines and use CLR Profiler to see how much String data is sitting on your heap!).
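If you don't have CLR Profiler handy, reference equality makes for a quicker sanity check. The sketch below uses made-up names (InternCheck, BuildAtRuntime, SameObject, Report); the only framework calls are String.Intern and Object.ReferenceEquals. It builds two equal strings at runtime and compares them as objects, before and after interning.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class InternCheck
{
    // Builds a new string object at runtime, so it is not a compile-time literal.
    public static string BuildAtRuntime(string value)
    {
        return new string(value.ToCharArray());
    }

    // True only when both variables point at the very same object on the heap.
    public static bool SameObject(string a, string b)
    {
        return Object.ReferenceEquals(a, b);
    }

    public static void Report()
    {
        string first = BuildAtRuntime("0123456789");
        string second = BuildAtRuntime("0123456789");

        // Equal text, but two separate objects.
        Console.WriteLine(SameObject(first, second));

        // After interning, both point at the one pooled instance.
        Console.WriteLine(SameObject(String.Intern(first), String.Intern(second)));
    }
}
```

Calling InternCheck.Report() prints False and then True: the text is equal in both cases, but only the interned pair shares a single object.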

Now sure, the array example is simple and contrived (you really wouldn't ever build a 2000 element string array with the same values, would you?). However, it illustrates a very useful way to reduce the memory churn in an application (and therefore its footprint, GC time, etc.). Suppose you had a DataSet filled with values: 10 string columns and 500 rows. Assume many of those 5000 string values were duplicates (the result of enumeration domains, looked-up values, etc.)... I wonder what the memory savings (and GC savings when you Dispose() it!) would be? A hint: big enough to pay attention to.
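As a rough sketch of that DataSet scenario (with loud assumptions: InternedTable, the "Orders" table, and its two columns are invented for the example, and rawRows stands in for whatever your data source hands you), interning each value on the way into a DataTable could look like this:

```csharp
using System;
using System.Data;

// Hypothetical loader, for illustration only.
public static class InternedTable
{
    // rawRows stands in for values parsed from a file or pulled from a data reader.
    // Each inner array is expected to hold { status, region }, neither of them null
    // (String.Intern throws ArgumentNullException on null).
    public static DataTable Load(string[][] rawRows)
    {
        DataTable table = new DataTable("Orders");
        table.Columns.Add("Status", typeof(string));
        table.Columns.Add("Region", typeof(string));

        foreach (string[] raw in rawRows)
        {
            DataRow row = table.NewRow();
            // Repeated values such as "Shipped" should end up pointing at one pooled string.
            row["Status"] = String.Intern(raw[0]);
            row["Region"] = String.Intern(raw[1]);
            table.Rows.Add(row);
        }
        return table;
    }
}
```

One caveat worth keeping in mind: the MSDN documentation notes that interned strings are not likely to be released until the CLR terminates, so this only pays off for values that really do repeat. Interning thousands of unique values would just pin them in memory.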

Saturday, January 01, 2005 9:10:46 PM (Eastern Standard Time, UTC-05:00) #  Disclaimer | Comments [2] | 
View Keith Rome's profile on LinkedIn

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.
