Wednesday, July 11, 2007

Setting up a Spamtrap.

So you want to get spam? It's easy and fun -- you just need to set up a dedicated trap. Here we look at the case of a trap sitting behind a DSL modem. Here's what to do:
  • You can give your outside IP address (the one your MX records will end up pointing to, the IP address MTAs are going to use to send you spam) a symbolic name using a dynamic DNS service. The idea is that if your external IP changes, you only change the dynamic DNS record and not the MX records of all the domains you've set up to catch spam.

    EasyDNS works well. They send periodic updates asking you to reconfirm that your external IP address is the one they have in their records. They also provide a Perl script (ddclient; you'll need IO::Socket::SSL and Net::SSLeay, which are easy to get). The configuration looks like:

    use=web, web=checkip.dyndns.org/, web-skip='IP Address'
    server=members.dyndns.org, \
    protocol=dyndns2 \
    my-easydns-domain
  • Find an old server to install Linux on. Make sure you configure iptables. Whether you want to bother with SELinux is your choice. I don't. Don't give it any domain name.
  • Install qmail. Qmail is nice but a bit delicate. The software hasn't been updated in a while and the author doesn't allow new releases, so there's the official last distribution and a plethora of patches. After a couple of trials that didn't go very well, I settled for the directions given by qmailrocks which, well, rocks.

    I went through all the steps but didn't install the following options: ezmlm, autoresponder and maildrop. I certainly installed vpopmail and the web-based vpopmail management interface, which makes it really easy to create new domains (you might want to register different domains to catch more spam.)

    During compilation, I had to create the following symlinks:

    cd /usr/include
    ln -s /lib/modules/2.6.9-1.667/build/include/linux
    This makes /usr/include/linux exist (that's for errno.h, for instance.) I also did the same thing with asm-generic (to /usr/include/asm-generic) and asm-i386 (to /usr/include/asm).

    My recommendation is that you spend an hour or so reading all the steps to know what's coming, and prepare everything before running the installation for real.

    Qmailrocks.com will walk you through all the steps, all the way down to starting qmail and making sure that it works. As far as domains are concerned (for the virtual domain or the rcpthosts entry), I used my-easydns-domain, which I registered with EasyDNS.

  • Next you create a mail domain -- use the vpopmail user interface that should be running on your mail host. This domain could be the one you registered with EasyDNS, but you can also create others. For now, let's go with my-easydns-domain. Create a postmaster for each domain and also an account where all the spam will go, for instance spam.

    Make sure that these accounts provide limited access (no POP, no web access for instance.) You're going to advertise email addresses that do not exist (you don't want to add these users manually all the time), and spammers are going to try their luck with addresses that could plausibly exist in your domain. The easiest thing to do is therefore to redirect all incoming email that isn't sent to postmaster to the spam account. It's easy to do: just edit the .qmail-default file that exists for a particular domain (this file lives in /home/vpopmail/domains/my-easydns-domain) so that it contains:

    | /home/vpopmail/bin/vdelivermail '' \
    /home/vpopmail/domains/my-easydns-domain/spam
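  • From your internal network, you can start testing that things are working. For instance, you can telnet to port 25 of your mail host and try an SMTP session. What you type is the telnet command itself, then the HELO, MAIL FROM, RCPT TO and DATA commands, the message text ended by a lone dot, and QUIT: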
    telnet 192.168.1.4 25
    Trying 192.168.1.4...
    Connected to host (192.168.1.4).
    Escape character is '^]'.
    220 my-easydns-domain
    HELO foo.edu
    250 my-easydns-domain
    MAIL FROM: bar@foo.edu
    250 ok
    RCPT TO: baz@my-easydns-domain
    250 ok
    DATA
    354 go ahead
    Hello!
    .

    250 ok 1184182572 qp 17173
    QUIT
    221 my-easydns-domain
  • Now take a look at /home/vpopmail/domains/my-easydns-domain/spam/Maildir/new/ and you should see a file named 1184182572.17175.hostname,S=240 which contains the raw mail you just sent to your domain. hostname is the name you gave to your mail host, the string returned when you type the command hostname.

  • It's a good idea to make qmail log all SMTP transactions for further analysis (for instance, you'll be able to write scripts to identify DHA, a simple knock on your SMTP door or transactions that fail for whatever reason.) Here's how you do this (thank you Chris for the tip!)

    Modify the file /service/qmail-smtpd/run to add /usr/local/bin/recordio before the invocation of /var/qmail/bin/qmail-smtpd. Once the modification is done, the file will look like this (the recordio line is the one added; only the last few lines are shown:)

    ...
    exec /usr/local/bin/softlimit -m 30000000 \
    /usr/local/bin/tcpserver -v -R -l "$LOCAL" -x /etc/tcp.smtp.cdb -c "$MAXSMTPD" \
    -u "$QMAILDUID" -g "$NOFILESGID" 0 smtp \
    /usr/local/bin/recordio \
    /var/qmail/bin/qmail-smtpd my-easydns-domain \
    /home/vpopmail/bin/vchkpw /usr/bin/true 2>&1

    The logs are going to end up in the /var/log/qmail/qmail-smtpd directory, first in a file called current and then in files named after the date of their last modification, using a hexadecimal notation: 8 hexadecimal digits for the number of seconds since the Epoch (as returned by time(2)) and another 8 hexadecimal digits for the fraction of a second (in nanoseconds), all of this after a @40000000 prefix. This is the TAI64N timestamp format that multilog uses.

  • Let's summarize what we have:
    • We have a DNS name (my-easydns-domain) that points to the IP address of our DSL modem
    • We have a mail host that runs qmail. On this mail host we have:
      • Domains (such as my-easydns-domain) that we can send email to. The user doesn't have to exist: everything goes into the spam account under /home/vpopmail/domains/my-easydns-domain/spam/Maildir/new.
      • qmail will save the logs of the entire SMTP transaction
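The timestamps in front of the log lines and in the rotated log file names can be turned into something readable with a few lines of code. Here is a minimal Python sketch, assuming the stamps are TAI64N labels: an @, 16 hexadecimal digits of seconds offset by 2^62 (hence the leading 4) and 8 hexadecimal digits of nanoseconds. The function names are made up for the example, and the 10 second skew is an assumption about how the label is derived from the system clock.

```python
from datetime import datetime, timedelta, timezone

TAI64_OFFSET = 1 << 62   # accounts for the leading 0x40000000 of the label
TAI_UTC_SKEW = 10        # assumption: label = system clock + 10 s (leap seconds ignored)


def parse_tai64n(label):
    """Return (unix_seconds, nanoseconds) for a label like '@4000000046...'."""
    if len(label) != 25 or not label.startswith("@"):
        raise ValueError("not a TAI64N label: %r" % (label,))
    seconds = int(label[1:17], 16) - TAI64_OFFSET - TAI_UTC_SKEW
    nanos = int(label[17:25], 16)
    return seconds, nanos


def tai64n_to_datetime(label):
    """Convert a label to a timezone-aware UTC datetime (microsecond precision)."""
    seconds, nanos = parse_tai64n(label)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment + timedelta(microseconds=nanos // 1000)
```

Feeding it the first 25 characters of a log line should give the time at which qmail-smtpd wrote that line.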
We're now ready to have SPAM flow to your newly configured spam trap.
  • Just modify your DSL modem configuration so that incoming SMTP traffic on port 25 is redirected to your mail host, the one running qmail. You can conduct an SMTP test from the outside to make sure that (1) the port redirection works and (2) you don't have a firewall rule on the DSL box or on the mail host that prevents traffic coming from the outside from flowing through port 25.
  • Now register domains with a registrar (anyone you want that gives you control over the values you put in your records.) For instance, if you create serialhacker.org (this one is taken, sorry!) using the vpopmail web-based admin interface, you want to register the serialhacker.org domain. During registration or after, you just set the MX records for that domain to point to...my-easydns-domain.

    Once the records have propagated (this takes more or less 24 hours) you will be able to send mail to serialhacker.org using a Yahoo! or Gmail account, for instance, and this mail will be sent to the IP pointed to by my-easydns-domain.

  • The next step is to advertise bogus email addresses using the domains that you registered. You can add them to web pages you maintain or post test messages to test newsgroups (you can automate this process using the Perl Net::NNTP package in a simple script.)
  • Soon spam will flow in. It's up to you to do whatever you want with it, but I personally wrote scripts to monitor traffic: where it's coming from, its intensity, etc... These scripts run as cron jobs and send me email if something happens...
A word of caution here: what you're going to receive is going to be extremely nasty, from offensive content to phishing attempts to viruses. You want to be extremely careful in the manner you handle the content -- you have been warned.
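As an illustration of the kind of monitoring script mentioned above, here is a minimal Python sketch that measures intensity by counting deliveries per hour. It only looks at the file names in Maildir/new and never opens the messages. It assumes names of the form seen earlier (1184182572.17175.hostname,S=240: delivery time, a process id, the host name and the size); the function names are made up, and this is not the script I actually run.

```python
import os
from collections import Counter
from datetime import datetime, timezone


def parse_maildir_name(name):
    """Split '1184182572.17175.hostname,S=240' into (timestamp, pid, host, size)."""
    base, _, info = name.partition(",")
    stamp, pid, host = base.split(".", 2)   # the host name may itself contain dots
    size = None
    for field in info.split(","):
        if field.startswith("S="):
            size = int(field[2:])
    return int(stamp), int(pid), host, size


def hourly_counts(maildir_new):
    """Count messages per UTC hour, based on file names only."""
    counts = Counter()
    for name in os.listdir(maildir_new):
        try:
            stamp, _pid, _host, _size = parse_maildir_name(name)
        except ValueError:
            continue                         # not a delivery file we understand
        moment = datetime.fromtimestamp(stamp, tz=timezone.utc)
        counts[moment.strftime("%Y-%m-%d %H:00")] += 1
    return counts
```

A cron job could call hourly_counts on /home/vpopmail/domains/my-easydns-domain/spam/Maildir/new and send mail when the latest hour is far above the usual level.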


Tuesday, September 26, 2006

On offshoring...

Offshore software development management:

This is an interesting article. And I realized that we've learned very similar lessons:
  • Contact and communication
  • Give people autonomy, they crave it and love it. Apply feedback when necessary
  • Considerations for cultural changes
  • Use a Wiki to freely organize information
  • Test scripts playing a greater role in stating requirements (maybe not as much as we should here)
  • Build based feedback: we're lucky to have a large QA team that can help us test early and often
  • Regular short status meetings (yes, per functional team)
  • Short iterations
And something we haven't done nearly enough:
  • Scheduled visits
The article stresses some of the issues that we ran into. Source control systems are slow over long-distance links: if you don't have a solution that allows for caching, it's going to be painful to check out entire trees. Milestone builds (we otherwise have our local build servers) need to be carefully planned, and you need the right resources on both sides to help you if something goes wrong (along with a clear idea of what your network availability is going to be.)


Friday, April 07, 2006

Virtualization (or: it's about time)

Virtualization is everywhere these days (duh.) Because of increases in hardware performance, security requirements and the need to run several OSes on one system, virtualization is now mainstream enough to be offered on desktop operating systems for the masses to use (Linux, OSX, Windows.) It is so far an x86 game only, although hypervisors for other architectures (like XScale) are in the works.

Hardware vendors are adding a lower, more privileged level of instructions to help write hypervisors that control the VMs, while the guest operating systems keep running at their current privilege level. This pushes security one level down while offering the devastating ability to compromise the software that manages the execution of all operating systems on the platform. This threat could be circumvented by using the TPM/PKI to ensure that the hypervisor isn't being tampered with (take a look at this diagram.)

Some articles I should have read a long time ago.

There's certainly hope that hardware will address some of the existing performance degradations brought by virtualization. Some applications that could greatly benefit from virtualization can't suffer the I/O performance degradation that existing virtualization technology fails to entirely alleviate.


Monday, December 19, 2005

Firewall failover

I just ran into a nice write-up of a firewall capable of stateful failover, running BSD pf.
  • CARP (Common Address Redundancy Protocol) is used to switch identity during failover, and CARP traffic is used as a measure of node availability (the master advertises using CARP and if the backup doesn't hear from its master, it'll start advertising itself.) CARP is IP protocol #112. CARP has an ARP balance feature that can be used to direct traffic to particular hosts. It can be seen as similar to VRRP, with the further advantages of being more secure and not encumbered by patents.
  • pfsync uses IP protocol #240 and performs connection state synchronization so that a stateful failover (i.e. no TCP connections are lost) can be supported. A node joining the firewall cluster will receive a bulk update of the existing connections and will then be updated periodically on a best effort basis.


Friday, November 04, 2005

Layer switches.

This `switch' terminology that applies to all layers of the OSI model is getting confusing. I needed a clear picture so I went to do a quick search and here's what I found:

Layer 2 switch:
  • That's your regular switch. It learns which MAC addresses show up on which port so that it can send traffic straight to the right port instead of forwarding it to all the ports like a hub does. It operates at the MAC address level.
Layer 3 switch:
  • Layer 3 switches are high performance routers. The switching part of the business is done in hardware instead of being handled by a CPU, and packets are switched based on their IP address, by comparing it against routing table entries. The routing tables are of course populated as the result of responding to routing protocols. More on the topic can be found here.
  • Layer 3 switches, being switches, are aware of VLANs and can route traffic between VLANs.
Layer 4 switch:
  • It does policy based switching -- a lot of what a firewall does if the device acts on layers above 4. An L4 switch can also do load balancing by identifying sessions and directing them to a dedicated server. See more here and here too.
Blurring the layers further, there's MPLS (Multi Protocol Label Switching), which inserts layer 2 and/or layer 4 information into tags that exist at layer 3, to be handled by specialized equipment (LSR: Label Switch Routers) to circumvent congestion, bottlenecks or link failures.
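The learning behaviour of a layer 2 switch fits in a few lines. This is a minimal Python sketch of the idea only (no aging of entries, no VLANs); the class and method names are made up for the example.

```python
class LearningSwitch:
    """Toy layer 2 switch: learns which MAC address lives behind which port."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}            # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        self.mac_table[src_mac] = in_port      # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # unknown destination: flood every other port, like a hub would
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                  # destination sits on the same segment
        return [out_port]
```

The first frame towards an unknown address is flooded; once the destination has spoken, later frames go out of a single port.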


Tuesday, October 04, 2005

STP based attacks.

Following a link from the Linux Bridging Ethernet project, I got to read about the security implications of bridging and the use of the STP (Spanning Tree Protocol.) The attacks are all based on abusing:
  1. The inherent trust that exists between bridging equipment in terms of Bridge Protocol Data Unit acceptance.
  2. The implementation of the topology management that mandates ports to sometimes (partially) block traffic.
Here are some possible attacks:
  • Trigger eternal elections of the bridge root. Upon detection of periodic BPDU packets, the attacker replies with a BPDU claiming that its root status supersedes the one expressed in the packet it just detected. While in bridge root election mode, traffic forwarding on ports is disabled, so the eternal elections amount to a DoS attack.
  • This type of election-based attack can be localized, isolating clients on one segment served by a bridge from a server located on a segment served by another bridge, giving the attacker a chance to impersonate the server.
  • An attacker equipped with links to two independently connected bridges can sever the connection between the bridges (by initiating and winning elections to be the designated bridge for the two segments) and become trusted to forward packets between the two bridges, effectively perpetrating a MITM attack.
  • STP extensions disabling the Learning state on user ports (Port Fast, Fast Start, etc...) for reasons of responsiveness will, upon perpetual elections, force the switch to reset its switching table, making the interfaces then in promiscuous mode subject to ARP cache poisoning and similar types of attack.
In light of these vulnerabilities, the use of STP and the deployment of STP-enabled equipment should be assessed, and a deliberate decision made on whether to use STP.
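Why is winning an election so easy? The root is simply the bridge advertising the numerically lowest bridge ID (priority first, then MAC address), and nothing authenticates the claim. A minimal Python sketch of that comparison, with made-up function names and addresses; it models the ordering only, not the BPDU format or the timers.

```python
def bridge_id(priority, mac):
    """Bridge IDs compare by priority first, then by MAC address; lowest wins."""
    # MACs are assumed to be written in one consistent 'xx:xx:xx:xx:xx:xx' form
    return (priority, mac.lower())


def elect_root(bridge_ids):
    """Return the bridge ID that wins the root election."""
    return min(bridge_ids)


def forged_bpdu_wins(current_root, forged):
    """True if a BPDU advertising `forged` would displace `current_root`."""
    return forged < current_root
```

With three bridges left at the default priority of 32768, an attacker advertising priority 0 takes over as root as soon as its BPDU is heard.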


Tuesday, June 14, 2005

HTTP request smuggling.

This paper recently posted on /. introduces HTTP request smuggling as a way to exploit discrepancies in the way applications parse HTTP/1.1 requests and act on their content.

Specially crafted combined HTTP requests can lead one application to see a certain request with a certain content. The data then reaches a second application where it is decoded differently:
HTTP_REQUEST .... ; Seen by application A and B
...
HTTP_REQUEST ... ; Seen by application B
...
HTTP_REQUEST ... ; Seen by application A
This discrepancy is exploited to:
  • Poison a web cache: the web cache A sees content 1 but the web server B sees something different and instead serves content 2, which gets associated with content 1 by web cache A.
  • Make a firewall such as an unpatched FW-1 R55W not see malicious content in a page and pass it down to IIS, where it will be wrongly absorbed (because of an IIS limitation/bug.)
  • Smuggle an XSS attack.
HTTP request smuggling gravitates around malformed HTTP requests such as:
  • Double Content-Length statements advertising different lengths -- some applications pick the first as being the right one, some pick the second, leading to different interpretations of the content.
  • GET requests plus Content-Length.
  • Buffer size limit anomalies, such as the IIS/48k limit
What's really to blame in HRS is that all the implicated applications use different HTTP parsers, with different interpretations of edge and borderline cases. The HRS technique is similar to HTTP Response Splitting, presented here: HTTP response splitting relies on application bugs that will generate two responses for one request, with the content of the second response being controlled by the attacker. The attack works by sending a first crafted request to the application, which generates two responses. A second request is then sent, to be matched by the second response. If the two responses go through a web cache, cache poisoning is effectively achieved.
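The double Content-Length case is easy to reproduce on paper. Below is a minimal Python sketch with two deliberately naive parsers, one trusting the first Content-Length header and one trusting the last, applied to the same byte stream. Names, paths and host are made up; real parsers are more involved, and this only illustrates the disagreement.

```python
def content_length(raw_headers, pick):
    """Content-Length a parser would use; `pick` is 'first' or 'last'."""
    values = []
    for line in raw_headers.split("\r\n"):
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-length":
            values.append(int(value.strip()))
    if not values:
        return 0
    return values[0] if pick == "first" else values[-1]


def split_stream(stream, pick):
    """Cut a stream into (request line, body) pairs the way one parser sees it."""
    requests = []
    while stream:
        head, sep, rest = stream.partition("\r\n\r\n")
        if not sep:
            break
        n = content_length(head, pick)
        requests.append((head.split("\r\n")[0], rest[:n]))
        stream = rest[n:]
    return requests


def demo_stream():
    """One POST carrying two Content-Length headers, followed by a GET."""
    smuggled = "GET /poison HTTP/1.1\r\nHost: example.test\r\n\r\n"
    return ("POST /a HTTP/1.1\r\nHost: example.test\r\n"
            "Content-Length: 0\r\n"
            "Content-Length: %d\r\n\r\n" % len(smuggled)) + smuggled
```

The parser that trusts the first header sees two requests (the POST, then GET /poison); the one that trusts the last header sees a single POST whose body happens to look like a request.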


Friday, May 27, 2005

Firewalk

I've been reading a bit about firewalk, a traceroute variant with knowledge of what a firewall might do in order to stop traffic at its door. Here are some notes.

On using traceroute
  • Letting traceroute use UDP packets and watching it fail to report on some hops tells us that some filtering is happening on some hosts. For instance, if we see this in our traceroute output:
    13  193.251.243.30  180.328 ms  183.666 ms  172.724 ms
    14 * * *
    15 * * *
    We know that at 193.251.243.30 some UDP filtering is happening. If instead we force traceroute to send ICMP packets (-I option), we might start to see what's behind the filtering host:

    13 193.251.243.30 164.937 ms 169.354 ms 170.566 ms
    14 193.251.251.54 175.550 ms 177.654 ms 173.038 ms
    15 193.252.117.254 173.594 ms 170.467 ms 180.090 ms
  • If the filtering host was blocking ICMP traffic as well, we could try to tickle it with some other UDP traffic it might accept, such as DNS. The trick is then to reach the host with the right port number (since traceroute increments the port number for each attempt made.) The computation is rather simple, based on the number of hops and the number of attempts per hop. With the -p option set to the computed starting port, traceroute traffic will reach the filtering host with the right port number. It's also easy to modify the traceroute source code not to increment the port number, so that tracing can start at a static port number of your choice.
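The port computation can be sketched as follows. This is a minimal Python sketch under one stated assumption: the classic traceroute behaviour, where probe number k (counting from 1 across all hops) is sent to the base port plus k. Other implementations may count differently, so check against a packet capture before relying on it. The function names are made up.

```python
def base_port(target_port, hop, probes_per_hop=3):
    """Value for traceroute -p so that the first probe of `hop` hits target_port.

    Assumption: probe number k (from 1, across all hops) goes to base + k.
    """
    probes_before = (hop - 1) * probes_per_hop   # probes spent on earlier hops
    base = target_port - probes_before - 1
    if base < 1:
        raise ValueError("target port too low for that many hops")
    return base


def probe_ports(base, hop, probes_per_hop=3):
    """Destination ports used by the probes of a given hop."""
    first = base + (hop - 1) * probes_per_hop + 1
    return [first + i for i in range(probes_per_hop)]
```

To get a DNS-looking probe (port 53) to the 14th hop with the default three attempts per hop, base_port(53, 14) gives 13. Only the first attempt of that hop lands exactly on port 53.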
Firewalk

Firewalk uses a combination of all these methods and works once you've (1) discovered a gateway (like 193.251.243.30 in the example above) and (2) discovered a host behind the gateway (like 193.251.251.54 in the example above.) It will send the IP traffic of your choice to ports, using the right TTL, in an attempt to find out what's behind the filtering agent, thereby obtaining a map of interesting hosts behind a firewall.

While reading the firewalk paper, I wrote an unpretentious Perl script that invokes traceroute and tries to report any host that does UDP filtering (by comparing traceroute runs using UDP and ICMP.) Just for kicks, it'll try to add geolocation information for the hosts it discovers. It's here. Here's a sample invocation:


$ geotraceroute.pl wanadoo.fr -geolocate -flag_gw
10.2.0.1: Marina Del Rey, California, United States.(lat=33.98, lon=-118.45)
10.10.72.254: Marina Del Rey, California, United States.(lat=33.98, lon=-118.45)
209.172.100.193: San Jose, California, United States.(lat=37.34, lon=-121.89)
209.172.121.229: San Jose, California, United States.(lat=37.34, lon=-121.89)
209.172.123.1: San Jose, California, United States.(lat=37.34, lon=-121.89)
140.174.37.61: Englewood, Colorado, United States.(lat=39.58, lon=-104.90)
129.250.26.38: Englewood, Colorado, United States.(lat=39.58, lon=-104.90)
193.251.250.41: Amsterdam, North Holland (province), Netherlands.(lat=52.35, lon=4.90)
193.251.240.2: Amsterdam, North Holland (province), Netherlands.(lat=52.35, lon=4.90)
193.251.241.133: Rue, Somme (department), Picardy (region), France.(lat=50.27, lon=1.67)
G 193.251.251.54: Rue, Somme (department), Picardy (region), France.(lat=50.27, lon=1.67)
193.252.117.254: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)
193.252.122.2: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)
193.252.122.18: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)
193.252.122.103: Issy-les-moulineaux, Hauts-de-seine (department), Ile-de-france (region), France.(lat=48.82, lon=2.27)



Notice the entry marked with a leading G. It indicates that this host filters UDP but not ICMP. It might just be a gateway. With netpbm it should be possible to map all the entries on a world map. Supported flags are -debug, -resolve, -flag_gw and -help.
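The comparison at the heart of the script boils down to lining up the two traceroute runs hop by hop. Here is a minimal Python sketch of that idea (it is not the Perl script itself, and the function names are made up): it parses traceroute output into one address per hop, with None where traceroute printed stars, and reports the first hop that answers ICMP while staying silent for UDP.

```python
def parse_hops(output):
    """Turn traceroute output into a list of addresses, None for '* * *' hops."""
    hops = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue                   # header or blank line
        hops.append(None if fields[1] == "*" else fields[1])
    return hops


def first_udp_filtered(udp_hops, icmp_hops):
    """First address that shows up with ICMP where the UDP run saw nothing."""
    for udp, icmp in zip(udp_hops, icmp_hops):
        if udp is None and icmp is not None:
            return icmp
    return None
```

Run on the two traceroute excerpts shown earlier in this post, it returns 193.251.251.54, the entry carrying the G flag above.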


Thursday, May 26, 2005

Metasploit

I'm looking at the Metasploit source code. It's written in Perl and fairly well organized. It seems to contain libraries that are worth looking at. A couple notes:
  • It ships with NetPacket, a Perl package that allows for packet crafting -- neat, just combine it with Pex::Racksocket
  • Their lib/Pex section contains all sorts of routines for low level manipulation of data, available as packages. For instance, the x86 one can help you generate x86 machine code. It's used to parametrize an exploit or a payload.
  • But I also saw mention of InlineEgg, which per its documentation seems to be even more powerful.


Thursday, May 19, 2005

DNS Cache poisoning

Reading this good Honeynet article on phishing, I followed links on pharming, or how to redirect traffic meant for a known site to a site of your choice, which can be achieved by DNS cache poisoning.

Basically, once you've tricked a DNS server into consulting another DNS server, the information sent back can contain new IP address assignments for some domain names -- anything goes as far as getting the victim DNS server to consult another DNS server: email to a non-existent user, embedded image links in email, banner ads, etc.

If your DNS software is flawed or not configured properly (NT4 and 2000 have insecure default configurations), it will accept these as a replacement for what it already knows about the said domain names. M$ is of course the prime target, and also some DNS packages from other well known security companies. Nice.


Tuesday, May 17, 2005

Crypto Refresh.

I'm refreshing my knowledge on applied cryptography.

Symmetric cryptography:
  • Symmetric/secret key crypto: parties share a secret, for instance an encryption key. Shared secrets must be communicated securely (with appropriate public/private key crypto, with a key agreement protocol, via a third party, etc... methods abound.)
  • Two sets of primitives: symmetric encryption algorithms (ensure data secrecy) and message authentication codes (MACs) to ensure data transmission integrity.
Example of a symmetric encryption scheme:
  • A block cipher mode and encryption method must be selected
  • A key is picked, for instance, a password can be used to generate a key (involves salt, iteration numbers.)
  • The block cipher mode determines how the encryption method will be used. For instance, it can be chosen to prevent the same unencrypted input from yielding the same encrypted output (better resistance to dictionary attacks.) For instance one could XOR the first chunk with a randomly generated string (an IV: initialization vector, which has to be transmitted) and then have this block encrypted. For the next block, use the previously encrypted block for the XOR operation (this is how CBC works -- just encrypting blocks as they come is ECB, a bad block cipher mode.)
  • At the heart of many modern computerized encryption schemes are Feistel rounds, which use permutation boxes and substitution boxes (S-boxes) on subkeys and the data to encrypt, to achieve Shannon's confusion and diffusion.
  • Recommendation would be to use CBC/AES.
  • Outside of block cipher mode consideration, decryption is achieved by either running the same encryption scheme on the encrypted data, or by running a dedicated decryption function.
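To make the steps above concrete, here is a minimal Python sketch that derives a key from a password (salt and iteration count), builds a toy block cipher out of Feistel rounds and chains the blocks in CBC mode. Everything in it is an assumption made for illustration: the round function is SHA-256 rather than real S-boxes, the cipher is NOT AES and must not be used to protect anything, and all names are made up.

```python
import hashlib
import os

BLOCK = 16                 # bytes per block
HALF = BLOCK // 2


def derive_key(password, salt, iterations=100000):
    """Password -> key, using a salt and an iteration count (PBKDF2)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    # keyed round function; it does not need to be invertible
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]


def encrypt_block(key, block, rounds=4):
    left, right = block[:HALF], block[HALF:]
    for i in range(rounds):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key, block, rounds=4):
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(rounds)):          # same rounds, run backwards
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key, plaintext):
    pad = BLOCK - len(plaintext) % BLOCK       # PKCS#7-style padding
    data = plaintext + bytes([pad]) * pad
    iv = os.urandom(BLOCK)                     # random IV, sent along in clear
    out, prev = [iv], iv
    for i in range(0, len(data), BLOCK):
        prev = encrypt_block(key, _xor(data[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, ciphertext):
    prev, out = ciphertext[:BLOCK], []
    for i in range(BLOCK, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    data = b"".join(out)
    return data[:-data[-1]]                    # strip the padding
```

Encrypting two identical plaintext blocks with encrypt_block alone (that's ECB) gives two identical outputs, while cbc_encrypt yields different ciphertext blocks for them, and a different result on every call because of the random IV.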
Hashes and message authentication codes (MACs):
  • Cryptographic hash function: process input and produce a fixed sized output (hash value or message digest.) Properties: one wayness, noncorrelation (bit flip resistant) weak/strong/partial collision resistance.
  • Universal hash function: keyed hashes.
  • MAC: hash function processing a message with a secret key (+ possible nonce) to give output that can't be obtained without the key.
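A MAC is a one-liner with a standard library. This minimal Python sketch uses HMAC with SHA-256 (one common MAC construction, picked here for illustration); the function names are made up.

```python
import hashlib
import hmac


def make_tag(key, message):
    """Compute the MAC of `message` under the shared secret `key`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key, message, tag):
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)
```

The sender transmits the message together with make_tag(key, message); the receiver accepts it only if check_tag returns True, which fails as soon as the message, the tag or the key differs.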
Public key cryptography, digital signature:
  • Involves large prime numbers and factorisation properties.
  • Allows for key agreement, digital signature and identity establishment.
  • Note that it's about 1000 times slower than symmetric key cryptography when comparison is applicable.
  • RSA (does all), Diffie-Hellman (key agreement only) and DSA (digital signatures only.)
  • Public/private key crypto: your public key is available for other parties to use to encrypt data that only you, with your private key, can decrypt.
  • PKI is used to establish trust between entities. Before using a public key, you must be sure it belongs to the party you want to send a message to. Signature: a document is hashed, and the result encrypted with your private key. Parties can use your public key to retrieve the hash and compare it to what it's expected to be. If the hash can be decrypted with your public key, you must be the one that signed it with your private key.
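The hash-then-sign mechanics can be shown with textbook RSA and a ridiculously small key. This minimal Python sketch assumes the classic classroom parameters p = 61 and q = 53; it has no padding and offers no security at all, and the function names are made up.

```python
import hashlib

# Classroom-sized key, for illustration only
N = 61 * 53      # 3233, public modulus
E = 17           # public exponent
D = 2753         # private exponent: (E * D) % (60 * 52) == 1


def digest(document):
    """Hash the document, reduced so that it fits below the tiny modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document):
    """Apply the private key to the hash."""
    return pow(digest(document), D, N)


def verify_signature(document, signature):
    """Apply the public key to the signature and compare with our own hash."""
    return pow(signature, E, N) == digest(document)
```

Anyone holding (N, E) can run verify_signature, but only the holder of D can produce a signature that passes.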
Signing and encrypting with public/private keys:
  • Concatenate the recipient's public key with the message and sign/encrypt the result
  • Simply signing and then encrypting the signature and the message doesn't work, as an intermediary can re-encrypt the signed message with somebody else's public key.
Key and certificate encoding:
  • Keys and certificates can be encoded to a binary object (DER: Distinguished Encoding Rules encoding)
  • Keys and certificates can be encoded to plaintext (PEM: Privacy Enhanced Mail encoding)
Authentication and Key Exchange:
Further reading here. A nice concise presentation on some crypto basics here.


Monday, May 16, 2005

TiddlyWiki (GTD TiddlyWiki)

I ran into this thing today. Great: it's a mix of HTML/CSS/JavaScript providing an application that runs in your browser without requiring a server. It seems easy to modify; there's no real separation of code and data, but updating seems easy as described.

Put that on a thumbdrive, along with a copy of Firefox for all popular platforms, and you should be able to edit content every time you find a computer. Neat.

It's here. The GTD version evolved from TiddlyWiki, adding multiple features.


Friday, May 13, 2005

Channel Attacks

This excellent page (along with a really good paper) made me discover the concept of channel attack. Basically, a channel attack consists of discovering something about data being processed by measuring the side effects of processing them, such as latency or even power consumption. Here's another paper on channel attacks. Fascinating stuff, as they seem to be really hard to close (they rely on the nature of the data processing that is being performed.)

The page that triggered the post speaks of channel attacks in the context of hyperthreading-capable CPUs. Recommendations to limit exposure are numerous:
  • HT aware cache sharing prevention
  • OS level prevention of execution of applications of different privilege level on the same core.
  • Crypto library should be redesigned to prevent channel attacks
Now here's something about an instance of side channel attack, and the process of a side channel attack is presented here.
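A timing channel is the easiest kind to picture. In this minimal Python sketch, a comparison routine that stops at the first mismatch leaks how many leading bytes of a guess are right. To keep the example deterministic, the number of loop iterations stands in for the running time an attacker would measure. The function names are made up, and this is not the hyperthreading attack from the paper.

```python
import hmac


def leaky_equals(secret, guess):
    """Early-exit comparison; returns (equal, steps), steps standing in for time."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for a, b in zip(secret, guess):
        steps += 1
        if a != b:
            return False, steps        # bails out at the first wrong byte
    return True, steps


def recover_first_byte(secret):
    """Guess the first byte by picking the candidate that 'runs' the longest.

    Assumes the secret is at least two bytes long.
    """
    best, best_steps = None, -1
    for candidate in range(256):
        guess = bytes([candidate]) + b"\x00" * (len(secret) - 1)
        _, steps = leaky_equals(secret, guess)
        if steps > best_steps:
            best, best_steps = candidate, steps
    return best


def safe_equals(secret, guess):
    """Comparison designed not to depend on where the bytes differ."""
    return hmac.compare_digest(secret, guess)
```

Repeating recover_first_byte position by position recovers the whole secret in at most 256 tries per byte instead of 256 to the power of its length. The fix is a comparison like safe_equals, whose running time does not depend on where the inputs differ.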


Monday, May 09, 2005

Cross Site Scripting

A recent Firefox security alert raised the possibility of cross site scripting vulnerabilities for Firefox users. As I didn't know anything about cross site scripting, Google came to the rescue and pointed me to this handy PDF. The idea is to entice the victim to visit a legitimate web site while forcing their browser to execute malicious code while visiting the site.

How does malicious code get embedded in a page served by an uncompromised server? For instance by placing that code into a CGI parameter that the targeted site accepts and whose content will be served back to the user. The malicious code, usually JavaScript, can then access sensitive user information stored in the browser and send it back to a web site set up by the attacker for data collection -- cookies are a prime target (i.e. document.cookie) since, under certain circumstances and with cookies set properly, one can access a web site while impersonating a legitimate user.

The <script> tag is the one that comes to mind to trigger JavaScript execution by the victim's web browser while accessing a legitimate site, but other tags like <img> could be used as well (through a src attribute set to javascript:...). Even the value attribute of a form's input widget could be used to carry malicious data, through the use of a "> to break out into free HTML.

Of course, besides the operator clicking through a possibly unsolicited URL it was provided with, the fault lies in the site that serves parameter values back to the user without performing any sanitization of their content (or input validation.) The web application should be written to check incoming parameters for acceptable values and discard or reject anything that could be abused -- the parameter input range should be carefully specified in that regard. Note that all input sources should be subject to filtering: query parameters, body parameters of POST requests and HTTP headers.

Output filtering could also be performed on the web server prior to sending the page back to the victim, and content inspection capable firewalls could also be deployed for ingress and egress filtering.

See also: CSRF or XSRF (Cross-Site Request Forgery) here.
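Both defenses, checking the parameter against what it is allowed to contain and escaping it on the way out, are short to write. This minimal Python sketch assumes a hypothetical q parameter echoed back into a form field and a made-up rule for what a user name may look like.

```python
import html
import re

# assumption for the example: user names are 1 to 32 letters, digits or underscores
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")


def validate_username(value):
    """Input validation: reject anything outside the specified range."""
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("unexpected characters in parameter")
    return value


def render_search_box(query):
    """Output filtering: escape the value before echoing it into the page."""
    return '<input type="text" name="q" value="%s">' % html.escape(query, quote=True)
```

With escaping in place, a value starting with "> no longer closes the attribute: the quote and the angle brackets come out as &quot;, &lt; and &gt; and are displayed rather than interpreted.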


Wednesday, April 06, 2005

Secure Programming Cookbook.

A coworker lent me a copy of O'Reilly's Secure Programming Cookbook. Just by reading the table of contents you get an idea of what you should be thinking about; the rest really is (this being a cookbook) implementation details.

Some notes on file use:
  • Don't create a file using fopen, as the permissions are 0666 (as modified by the umask.) I never liked the f functions anyways.
  • Avoid TOCTOU (Time of Check Time of Use) race conditions (between the time you check and the time you use, the attacker has done something to your target and you operate on something that's not what you thought of in the first place.) The best way to avoid these when dealing with files is always to operate on the file descriptor, as the underlying file object doesn't change. If you're using functions that rely on a string to determine the filename you're 1) wasting cycles and 2) exposing yourself to a TOCTOU based attack.
  • Unix/Linux doesn't support mandatory file locks very well. Windows gets it right (a file lock is a lock that, once acquired on a file, prevents other processes from accessing the file in the way described by the lock while the lock is held.)
  • To make sure a temporary file can't be used by anyone else: after the file has been opened, delete the file. The process owning the file descriptor will still be able to use the file, while no one else will because the file doesn't exist by name.
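These notes translate directly to the system call level. The cookbook works in C; here is a minimal Python sketch on top of the same calls (open with explicit flags and mode, fstat on the descriptor, unlink right after opening), assuming a Unix system. The function names are made up.

```python
import os
import tempfile


def create_private_file(path):
    """Create `path` with mode 0600, failing if something already sits there."""
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)


def size_of_open_file(fd):
    """Ask the descriptor, not the name: the answer is about the file we hold."""
    return os.fstat(fd).st_size


def anonymous_temp_file(directory=None):
    """Open a temporary file and unlink it at once; only the descriptor remains."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.unlink(path)
    return fd
```

O_EXCL makes the creation fail if an attacker planted a file or symlink under that name, and once the temporary file is unlinked nobody can open it by name while the descriptor keeps working.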


Tuesday, March 29, 2005

On botnets.

The other day I was following a link from an entry of .tHE pRODUCT blog. It led to a recent Honeynet piece on botnets. The article is fascinating and well worth the read. It yields very valuable URLs for further exploration of the topic. Here are a few notes.

Here's how a botnet is built:
  • Vast sections of the public internet are scanned
  • Attacks are launched on 445/TCP (M$ directory service), 139/TCP and 137/UDP (NetBIOS) or 135/TCP (M$ RPC.) Unpatched XP or SP1 patched XP hosts constitute the bulk of the victims
  • An IRC client (IRC bot) is installed on the system and connects to a central server, from where the bots can be managed (uploaded software for instance) or used (DDoS attacks, spamming, keylogging, attacks on IRC networks, Google AdSense and other automated clicking abuses, etc...)
  • Once compromised, a machine will try to compromise more systems (such as its peers as reachable through examined system configuration)
A compromised machine is called a Bot. A collective of bots is a botnet. Botnets can comprise hundreds of thousands of machines. They are built and used for fun and profit. Note that an unpatched and unprotected Windows box will be compromised within 10 minutes or less on average.

Here's how bots are caught:

An attractive victim (honeypot) is placed behind a honeywall, a system running snort_inline, so that the IRC traffic can be observed while the bot is rendered harmless. The IRC client is also replaced by something that won't be detected as foreign when the bot joins its botnet -- the Honeynet people wrote their own, called Drone.


Friday, March 25, 2005

XML-RPC, SOAP, etc...

1. XML-RPC

An XML document is used to describe a procedure invocation -- the server/CGI interface maps the method name to some executable resource, and the arguments can be of scalar or non-scalar (such as arrays and structures) types. The invocation returns through an XML document that carries either the returned results or an error.
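To see what such a document looks like without setting up a server, here is a small sketch using Python's standard xmlrpc.client module (not the Perl Frontier modules discussed in this post). The method name sample.measure and its arguments are made up for the illustration.

```python
import xmlrpc.client

# Hypothetical method name and arguments, for illustration only.
request_xml = xmlrpc.client.dumps(
    (3, [1, 2], {"unit": "cm"}), methodname="sample.measure"
)
# Decoding gives back the scalar, array and struct arguments.
params, method = xmlrpc.client.loads(request_xml)

# A successful response carries exactly one returned value.
response_xml = xmlrpc.client.dumps((42,), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)

# An error travels back as a fault document instead.
fault_xml = xmlrpc.client.dumps(xmlrpc.client.Fault(4, "no such method"))
```

Printing request_xml shows the methodCall, methodName and params elements; decoding fault_xml with loads raises xmlrpc.client.Fault.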

Here's a HOWTO/tutorial on XML-RPC.

BLOB upload can be problematic; an alternative is described here. An implementation on the client side can reuse the user agent created for the Frontier::Client and decode the returned XML through the client's {'enc'}->decode method. On the server side, a Frontier::RPC2 object can be created in order to use the encode_response method to produce the returned XML.

2. SOAP

SOAP (Simple Object Access Protocol) is an XML-based protocol that lets applications exchange information over HTTP. XML is used to transport the message, which features:

  • An envelope (to identify the message as a SOAP message)
  • Optional header
  • A body containing call and response
  • Optional Fault section for error reporting on message processing
A SOAP method is an HTTP request/response that complies with the SOAP encoding rule (HTTP + XML = SOAP.) A SOAP request can be an HTTP POST or GET request. Content-Type can for instance be application/soap+xml; charset=utf-8.
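The envelope/header/body layout can be sketched with Python's standard xml.etree.ElementTree. This is a bare-bones illustration, not a SOAP toolkit: the application namespace, the method name and the helper build_envelope are made up; only the SOAP 1.2 envelope namespace is the real one.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 namespace
APP_NS = "http://example.org/stock"  # hypothetical application namespace


def build_envelope(method: str, arguments: dict) -> bytes:
    """Return the XML bytes one would POST as application/soap+xml."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")  # optional, left empty
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in arguments.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")
```

For example, build_envelope("GetPrice", {"symbol": "IBM"}) produces an Envelope whose Body holds a single GetPrice element.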

3. Comparison

I found this comparison of XML-RPC/SOAP interesting.

XML-RPC is really simple and to the point. SOAP picks things up where XML-RPC left them and tends to be bigger and more bloated. It is endorsed by the big names of the industry (IBM and M$, with IBM having the most complete implementation.)

4. Implementation

XML-RPC seems to enjoy the widest array of implementations: Python, C, C++, Java and Perl to name a few. Available and open SOAP implementations seem to be written mostly in Java, but there's at least one C++ based implementation. A SOAP C++ implementation is here (but it's unclear whether Java is required or not...)

XML-RPC seems to be fairly easy to integrate at the Perl CGI/bin level. The Frontier::RPC2 module provides what is required for XML input parsing and output formatting -- it pretty much works straight out of the box, but base64 support is said to be buggy. An alternative to try would be RPC::XML.

A note on the Perl stuff. If you're accessing through https, you will get a `Protocol scheme 'https' is not supported' error if IO::Socket::SSL isn't installed on the client (for a second you might think it's a Frontier package problem but it's not.) Also, once you've created a server, setting $server->{'debug'} will turn debug printout on. For some errors it's lacking a bit, though, and you're better off exploring the content of the value returned by Frontier::call() (an HTTP code is always printed though.)

Something else I haven't read yet: Zope and XML-RPC here.

A modern use of SOAP is SOA (Service Oriented Architecture.) There's a description of SOA here.


Thursday, March 24, 2005

Packet filtering with iptables

Netfilter implementation

Just as a reminder: netfilter provides a way for kernel modules to register callbacks/hooks (there are 5 of them) with the network stacks (IPv4, IPv6 and DECnet.) iptables implements a named array of rules in the kernel. For the purpose of implementing a firewall, tracking connections or doing NAT, packets arriving at the netfilter hooks are sent traversing iptables and might or might not get out of it alive.

About the use of the limit module (found in the Packet Filtering HOWTO, from netfilter.org:)
  • By combining the number of packets to accept per unit of time (like a second) and the number of burst packets, one can implement syn-flood protection. For instance: -p tcp --syn -m limit --limit 1/s will make you accept five SYN requests in a burst (five being the default burst size), after which only one per second is allowed and the burst buffer is replenished by one packet every second.
  • With -p tcp --tcp-flags SYN,ACK,FIN,RST RST... and limits, one can thwart a furtive port scanner.
  • With -p icmp --icmp-type echo-request and limits, one deals with the ping of death.
Note that the rules can be injected into the packet filtering infrastructure using the iptables commands, but the libipq API is provided as a way to interact from user space with iptables. For instance, there's a version of snort that receives packets from iptables instead of getting them from libpcap and will inform iptables whether the packet should be dropped, rejected or modified.
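The limit module behaves like a token bucket. The sketch below is a toy model in Python, not the kernel's implementation: the class name LimitMatch and the simulate helper are made up, and time is passed in explicitly so the behavior is easy to follow.

```python
class LimitMatch:
    """Token-bucket model of '-m limit --limit 1/s' with a burst of 5."""

    def __init__(self, rate_per_second: float = 1.0, burst: int = 5):
        self.rate = rate_per_second
        self.burst = burst
        self.credit = float(burst)  # the bucket starts full
        self.last = 0.0

    def matches(self, now: float) -> bool:
        # Refill one credit per 1/rate seconds, capped at the burst size.
        elapsed = now - self.last
        self.credit = min(float(self.burst), self.credit + elapsed * self.rate)
        self.last = now
        if self.credit >= 1.0:
            self.credit -= 1.0
            return True
        return False


def simulate(arrival_times):
    """Return, for each arrival time, whether the packet matches the limit."""
    limiter = LimitMatch()
    return [limiter.matches(t) for t in arrival_times]
```

Seven SYNs arriving at the same instant give five matches followed by two misses; one second later a single new credit is available.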

Firewalling with netfilter

Existing chains:
  • PREROUTING: before the route decision is taken (does mangling and nat)
  • FORWARD: when the firewall is routing input traffic, goes to post-routing (does mangling and filtering)
  • INPUT: on the incoming traffic (does mangling and filtering)
  • OUTPUT: the outgoing traffic, goes to post-routing (does mangling, nat and filtering)
  • POSTROUTING (mangling and nat)
Each chain holds rules for one or more tables (filter, mangle and nat.) Each chain implements rule-based filtering on its tables and, after consultation, the fate of the packet is decided: ACCEPT or DROP. If no rule matches the packet under scrutiny, then the default chain policy is applied -- usually, you want to DROP the packet if you failed to determine its fate after filtering.
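The "first matching rule wins, otherwise the chain policy applies" logic can be modeled in a few lines of Python. This is a simplified sketch: the Rule type, the traverse function and the sample chain are made up, and real chains have more targets than ACCEPT and DROP.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, object]


@dataclass
class Rule:
    match: Callable[[Packet], bool]
    target: str  # "ACCEPT" or "DROP"


def traverse(chain: List[Rule], packet: Packet, policy: str = "DROP") -> str:
    """Return the target of the first matching rule, else the chain policy."""
    for rule in chain:
        if rule.match(packet):
            return rule.target
    return policy


# Hypothetical INPUT chain: allow established traffic and SSH, drop the rest.
input_chain = [
    Rule(lambda p: p.get("state") == "ESTABLISHED", "ACCEPT"),
    Rule(lambda p: p.get("proto") == "tcp" and p.get("dport") == 22, "ACCEPT"),
]
```

A new TCP connection to port 22 is accepted by the second rule, while a new UDP packet to port 53 falls through to the DROP policy.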

Connection tracking:

Tracking connections at the firewall level is important -- for instance, you want to allow new and established connections to leave your network, and only established connections to enter it. Note that after a connection has been established and a few packets have been sent, the connection is declared assured.

Tracking TCP connection closure is done by recognizing the FIN/ACK or RST sequence, and letting the connection enter a time wait status to give all packets a chance to traverse the firewall ruleset (think about out-of-order packets reaching the firewall after the connection reset packet has been received.)

Note that UDP and ICMP traffic is tracked as connections in the sense of monitoring what is being sent to whom and back, although the UDP and ICMP protocols don't establish connections by themselves. Since ICMP packets can be sent back to connection originators to indicate a problem, they should be considered as related to other connection traffic.

Some complex protocols (such as FTP) negotiate related connections inside a control connection, which requires the firewall to examine the traffic in order to mark the requested new connections as related.

Further readings that I should indulge in:
  • PF stateful packet filter porting
  • Design and performance of OpenBSD's pf


Wednesday, March 23, 2005

RAW socket programming.

RAW:

A recent OpenBSD exploit eventually got me into looking at RAW socket programming. The exploit can be found here. The checksum routine doesn't work BTW (use any other implementation and it works.)

I took a look at the patch that fixes the exploit, but I don't know enough about OpenBSD to understand why it crashes some specific kernels -- it looks like a bogus timestamp can trigger the computation of a TCP retransmit timeout that will eventually crash the system.

packet generation utility:

All this makes me want to write a generic packet generator. netcat (see tutorial here) is nice but I guess what I want is something that lets me craft packets the way I want, for instance
  • packet -src=192.168.0.1 -dst=192.168.0.2 -proto=tcp -ttl=255 -tos=0 -fragment=no -sport=1234 -dport=80 -seq=rand -ack=rand -flags=SYN,ACK -window=512
Or
  • packet -ether-type=arp -ether-src=00:... -ether-dst=broadcast

You get the idea.
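Since the checksum is where hand-crafted packets usually go wrong, here is a sketch of the RFC 1071 Internet checksum and of a bare IPv4 header built with Python's struct module. It only builds bytes and sends nothing; the function names and default values are mine, not those of the imagined packet tool.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int,
                proto: int = socket.IPPROTO_TCP, ttl: int = 255,
                tos: int = 0) -> bytes:
    """Build a 20-byte IPv4 header (no options, no fragmentation)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of 5 32-bit words
    fields = [version_ihl, tos, 20 + payload_len, 0, 0, ttl, proto, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = internet_checksum(header)  # computed with the field zeroed
    return struct.pack("!BBHHHBBH4s4s", *fields)
```

A handy sanity check: running internet_checksum over a header that already contains its checksum yields 0.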


Friday, March 18, 2005

Passive OS fingerprinting.

The classic on OS fingerprinting is here. Active OS fingerprinting relies on sending packets (mostly TCP and ICMP) to open or closed ports and observing the answer, but this is rude.

p0f is a passive OS fingerprinting tool that allows for all sorts of interesting applications. It works best when it sits waiting for packets to show up for analysis. For instance, it could be installed on a web server to look at incoming TCP packets and find out what is connecting to it.

Here's how it figures certain things:
  • the uptime: the timestamp on SYN requests here (but this depends on the OS: Linux seems to be using ctime, Windows is using some HZ increment.)
  • The link type: with the gathered MSS/MTU (packet -vs- payload size) values
  • NAT: analyzing disparities in fingerprinting received for the same IP (link type, OS identification, etc...)
Just a few hints:
  • Look at the TTL value in the received packet and do a traceroute to figure the TTL: 64 is common for Linux/BSD, 128 could be a Windows box
  • Look at the Window Size: 0x1600/0x2D00 or so and somewhat constant through the connection is common for Linux.
  • A Window Size that changes through the life of the connection is common for Windows.
Mention of p0f fetched here.
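The TTL hint can be turned into a tiny heuristic. The sketch below is my own illustration, not p0f's logic: it assumes senders start from one of a few common initial TTL values (the list is an assumption) and rounds the observed TTL up to the nearest one.

```python
COMMON_INITIAL_TTLS = (32, 64, 128, 255)  # assumed set of typical defaults


def guess_initial_ttl(observed_ttl: int) -> int:
    """Round the observed TTL up to the nearest common initial value."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    raise ValueError("TTL must fit in one byte")


def describe(observed_ttl: int) -> str:
    """Guess the hop count and, very roughly, the OS family of the sender."""
    initial = guess_initial_ttl(observed_ttl)
    hops = initial - observed_ttl
    family = {64: "Linux/BSD", 128: "Windows"}.get(initial, "unknown")
    return f"initial TTL {initial}, about {hops} hops away, maybe {family}"
```

This is only a hint: the initial TTL is configurable, so it should be combined with the other observations listed above.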


Python.

So far, the things I like about Python:
  • the functional programming stuff: filter/map/reduce
  • ranges
  • list comprehension
  • sets and operations on sets
  • loops and iterators/generators
  • introspection
  • continuations (code invocation context awareness)
Things I like less:
  • Implicit declaration of variables and fields
  • No functions with multiple signatures, unless I'm mistaken

Tuesday, January 25, 2005

BPM.

Some acronym mapping:
  • BPM: Business Process Management (wiki)
  • ERP: Enterprise Resource Planning (wiki)
  • CRM: Customer Relationship Management (wiki)
  • SCM: Supply Chain Management (wiki)
BPM is a relatively recent concept. ERP, CRM and SCM are older solutions that still influence BPM. BPM has been going very strong the past few years, and will continue to grow as more companies find benefits in deploying it.


Friday, November 19, 2004

On threading.

Add: read/write locks, spinlocks, valgrind stuff, read NPTL source code (implementation details), Pthread and MPI, NUMA

Question: linux kernel version, old LinuxThreads or NPTL, compiler, MPI and threads?

The OpenGroup specs for threads (and other.)

LinuxThreads -vs- NPTL:


LinuxThreads:

Some documentation here

There's a manager thread that handles thread creation/destruction, fatal signals, memory management (allocated stack, thread local data, etc...) and waiting on dead threads.

Synchronization primitives are implemented with signals (spurious wake-ups, pressure on the kernel signal system.) Broken SIGSTOP/SIGCONT implementation (ctrl-Z doesn't work.) Limits on the number of threads.

Reliance on the manager thread imposes performance penalties because of the serialization of creation/deletion, the monopolization of one CPU and increased context switching.

A list of all threads is maintained, to implement pthread_key_delete for instance. /proc clutter.

NPTL:
Correct implementation of POSIX signal handling in the kernel solves the fatal signal handling issues.

Kernel can do thread memory deallocation. Kernel can reap terminated threads.

Thread specific data and local storage are managed through generation counters.

Synchronization primitives are implemented with futexes, which can be placed in shared memory so that PTHREAD_PROCESS_SHARED can be implemented (mutexes shared between threads belonging to separate processes.)

Thread local storage and thread data structures are merged in one block and placed on the stack. Stack frames can be cached to improve thread creation/deletion performance (unmapping a stack frame is costly, depending on the architecture.)

Required kernel support:

  • Arbitrary thread specific data areas support

  • Extension of clone to optimize thread creation and facilitate termination (no manager thread required.)

  • POSIX signal handling for multi-threaded processes, fatal signal terminate entire process. Stop/continue affects entire process (ctrl-Z works)

  • exit_group syscall added.

  • And more...

There's a testing effort ongoing: see results here. It's based on the POSIX test suite.

There are scalability limits that started to get documented


1-on-1 -vs- M-on-N

Kernel threads are used (a pure user-level implementation makes multi-processor use impossible.) M-on-N schedules M user threads on N kernel threads: two schedulers at work, which need to cooperate. Doesn't fit well on Linux (a user level context switch often requires copying register content from kernel space.) The O(1) scheduler negates the advantage of a user level scheduler. Overhead cost. Simplifies signal delivery.



Message queues:

Inter process message exchange.
  • mq_open creates a message queue. It's identified by a path name starting with / (no directories allowed) and features R/W/RW privileges. Attributes (mq_maxmsg and mq_msgsize) can be specified if privileges for the given message queue are granted (permissions are resolved on the name as for file access.)

  • mq_send sends a message to a message queue. The priority argument determines where the message is inserted in the queue. When the queue is full, the caller blocks (unless O_NONBLOCK is set.) If several threads are blocking, the highest priority one is awakened to send the message.

  • mq_notify subscribes to message notification. A struct sigevent argument specifies how to be notified (and how the poster will be notified of the IO completion ?)

  • mq_receive receives the oldest message of the highest priority. Same blocking policy as for mq_send.

  • Attributes: mq_maxmsg, mq_msgsize, mq_curmsgs and mq_flags (O_NONBLOCK)
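The ordering rule (highest priority first, oldest first within a priority) is easy to model. The sketch below is a toy in Python built on heapq; it is not a binding to the POSIX mq_* calls, the class name is made up, and it raises an error where a real queue would block.

```python
import heapq
import itertools


class PriorityMessageQueue:
    """Toy model of mq_send/mq_receive ordering (no blocking, no kernel)."""

    def __init__(self, maxmsg: int = 10):
        self.maxmsg = maxmsg  # plays the role of mq_maxmsg
        self._heap = []
        self._sequence = itertools.count()  # keeps FIFO order per priority

    def send(self, message: bytes, priority: int = 0) -> None:
        if len(self._heap) >= self.maxmsg:
            # A real queue would block here, or fail under O_NONBLOCK.
            raise BlockingIOError("queue full")
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._sequence), message))

    def receive(self):
        """Return (message, priority) for the oldest highest-priority message."""
        neg_priority, _, message = heapq.heappop(self._heap)
        return message, -neg_priority
```

Sending "low" at priority 1 and then two messages at priority 9 returns the two priority 9 messages in the order they were sent, then "low".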
Initialization:
pthread_once: An argument of type pthread_once_t is initialized to PTHREAD_ONCE_INIT and the function is called with a function pointer to execute. Subsequent calls won't do anything, as the value in the pthread_once_t is modified to mark that the initialization happened.
Thread Specific Data:
Can be viewed as a thread private array of PTHREAD_KEYS_MAX void * addressed by keys. Keys are common to all threads, but values are private. TSDs are disposed of when cancelling or exiting a thread.
  • pthread_key_create allocates a new key (whose value is returned through a parameter) and sets the initial value to NULL. A destructor can be attached to the key for destruction at cancellation or exit. The destructor doesn't run if the associated value is NULL; when it does run, NULL is associated with the key before the destructor is handed the old value.

  • pthread_key_delete deallocates a TSD key, but doesn't run the destructor and doesn't care about the deallocated value.
This is the old way of doing things. Nowadays, we let the compiler handle things by using the __thread keyword.
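The "same key, private value per thread" idea exists in most thread libraries. As an analogy only (this is Python's threading.local, not pthread keys or __thread), the sketch below shows two threads writing to the same name without seeing each other's value; the helper names are made up.

```python
import threading

context = threading.local()  # each thread sees its own attributes


def worker(name: str, results: dict) -> None:
    context.name = name  # private to the calling thread
    results[name] = context.name


def run_workers(names):
    """Start one thread per name and collect what each thread saw."""
    results = {}
    threads = [threading.Thread(target=worker, args=(n, results)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

After run_workers(["a", "b"]), the calling thread still has no context.name attribute of its own.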
Synchronization:

Mutexes

Mutex functions shouldn't be called from signal handlers (they're not async signal safe.) Mutex functions aren't cancellation points (don't hold mutexes for long when cancellation is deferred.)

Two states: unlocked (not owned), locked (owned by one thread.) Acquiring an owned mutex will make the thread block.

  • pthread_mutexattr_init: Initializes a mutex attribute object. The mutex kind determines what happens for pthread_mutex_lock on a mutex the calling thread already owns: PTHREAD_MUTEX_FAST_NP suspends the calling thread forever; PTHREAD_MUTEX_RECURSIVE_NP returns immediately with a success code, the number of times the owning thread locked the mutex is recorded and must be matched by the same number of pthread_mutex_unlock calls; PTHREAD_MUTEX_ERRORCHECK_NP returns with error code EDEADLK.
  • pthread_mutex_lock: if unlocked, the mutex is put in the locked state and belongs to the calling thread. If locked by another thread, the caller blocks; if already locked by the caller, the behavior depends on the mutex kind (fast, recursive or error check.)
  • pthread_mutex_trylock: just like pthread_mutex_lock but returns EBUSY if the mutex is owned.
  • pthread_mutex_unlock: fast mutexes are returned to the unlocked state, recursive mutexes have their reference count decremented and the mutex unlocked when the count reaches 0, error checking mutexes are checked for lock state and ownership by the calling thread (an error code may be returned.) Fast and recursive mutexes can be unlocked by a non owner. This is not a portable behavior.
  • PTHREAD_MUTEX_INITIALIZER, PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP: Static initialization macros.
  • pthread_mutex_destroy does nothing on Linux except check that the mutex is unlocked.
  • pthread_mutexattr_{set,get}type set/query mutex kind attribute.

Condition variable (condition)

Additional info here. Condition variables are implemented with futexes (wait queues.)

Mutexes control access to data. Condition variables provide synchronization on data value, without resorting to polling. A mutex is required to work with a condition variable because of race conditions (to avoid a thread signaling a condition before another thread can wait on it.)
  • pthread_cond_init: No condition attributes are supported by the Linux implementation. Static initialization is done through PTHREAD_COND_INITIALIZER
  • pthread_cond_wait: atomically unlocks the associated mutex and waits on a condition. pthread_cond_timedwait adds a timeout (and may return ETIMEDOUT)
  • pthread_cond_signal: restarts one thread waiting on the condition. pthread_cond_broadcast restarts all waiting threads (prone to the thundering herd problem/scheduler thrashing.) When a thread is restarted, the associated mutex is automatically and atomically locked again.
  • pthread_cond_destroy: destroys resources used by the condition. Does nothing on Linux except check that no thread waits on the condition
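The mutex-plus-condition pattern looks the same in most thread libraries. Here is a sketch using Python's threading.Condition as a stand-in for the pthread calls; the Mailbox class is made up for the illustration. The important part is that the waiter re-checks the data under the mutex instead of trusting the wake-up.

```python
import threading


class Mailbox:
    """One-slot mailbox: get() waits until put() has stored an item."""

    def __init__(self):
        self._cond = threading.Condition()  # owns its mutex
        self._item = None

    def put(self, item) -> None:
        with self._cond:  # lock the mutex before touching the data
            self._item = item
            self._cond.notify()  # comparable to pthread_cond_signal

    def get(self, timeout: float = 5.0):
        with self._cond:
            # wait_for releases the mutex while waiting and re-checks the
            # predicate after every wake-up, spurious or not.
            if not self._cond.wait_for(lambda: self._item is not None, timeout):
                raise TimeoutError("no item arrived")
            item, self._item = self._item, None
            return item
```

Because the predicate is checked under the mutex, it doesn't matter whether put() runs before or after get() starts waiting.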
Semaphores
Semaphores are counters for resources shared between threads. They are incremented/decremented atomically.
  • sem_init initializes a semaphore with an initial value. Semaphores can be shared amongst processes (LinuxThreads doesn't support this.)

  • sem_wait suspends the calling thread until the semaphore has a non-zero count, then decreases the count. sem_trywait is the non-blocking variant (EAGAIN is returned if the count is zero.) This is the P operation.

  • sem_post atomically increases the count of the semaphore and never blocks (this is the V operation.) It is signal safe (the only POSIX synchronization function that is; the LinuxThreads implementation isn't.)

  • sem_getvalue gets the current count of a semaphore.

  • sem_destroy releases resources allocated for a semaphore. For LinuxThreads, it just checks that no thread is waiting on the semaphore.
consumer:
loop:
  P (); ; Semaphore initialized to 0, consumer blocks
  consume;
  goto loop;

producer:
loop:
  produce;
  V (); ; Semaphore brought from 0 to 1, consumer awakens
  goto loop;
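The producer/consumer pseudocode above can be made runnable. The sketch uses Python's threading.Semaphore in place of sem_wait/sem_post; the run function and the use of a deque as the shared buffer are my choices for the illustration.

```python
import threading
from collections import deque


def run(items):
    """Pass every item from a producer thread to a consumer thread."""
    buffer = deque()
    available = threading.Semaphore(0)  # counts produced-but-unconsumed items
    consumed = []

    def producer():
        for item in items:
            buffer.append(item)  # produce
            available.release()  # V: never blocks, wakes the consumer

    def consumer():
        for _ in range(len(items)):
            available.acquire()  # P: blocks while the count is zero
            consumed.append(buffer.popleft())  # consume

    threads = [threading.Thread(target=consumer),
               threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

The consumer is started first on purpose: it simply blocks on the zero-initialized semaphore until the producer posts.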
Barrier
Barriers set the number of threads that must reach a barrier before all of them can be allowed to continue.
  • pthread_barrier_init Initializes a barrier with a count number.
  • pthread_barrier_wait Thread blocks until the required number of threads have reached the specified barrier.
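As an analogy to pthread_barrier_init/pthread_barrier_wait, here is a sketch with Python's threading.Barrier (the rendezvous function is made up). Each thread records how many threads had finished the first phase when it got past the barrier.

```python
import threading


def rendezvous(count: int):
    """Run `count` threads that all wait for each other at a barrier."""
    barrier = threading.Barrier(count)
    before, after = [], []
    lock = threading.Lock()

    def worker(index: int) -> None:
        with lock:
            before.append(index)  # first phase
        barrier.wait()  # blocks until `count` threads have arrived
        with lock:
            # Every thread has finished its first phase by now.
            after.append(len(before))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return after
```

With four threads, every entry of the returned list is 4: nobody got past the barrier before all four had arrived.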

Thread management:

pthread_cancel:

Sends cancellation to thread. Receiving thread can ignore the request, honor it right away or defer until the next cancellation point. Receiving thread exits as if pthread_exit(PTHREAD_CANCELED) had been called.
  • pthread_setcancelstate sets either PTHREAD_CANCEL_{ENABLE,DISABLE}

  • pthread_setcanceltype sets PTHREAD_CANCEL_ASYNCHRONOUS (as soon as the cancellation request is received) or PTHREAD_CANCEL_DEFERRED (at the next cancellation point.)

When cancelling: execution of cleanup handlers (reverse order, LIFO), finalization for thread specific data and return PTHREAD_CANCELED.

pthread_cleanup_{push,pop,push_defer_np,pop_restore_np}:

Manages cleanup handlers:
  • pthread_cleanup_push: installs a cleanup handler, called when the thread terminates (cancel or exit.) LIFO.
  • pthread_cleanup_pop: removes the last cleanup handler, with the possibility of executing it.

    Note: these matching pairs should be in the same block/function (they're macros and introduce a {/} sequence.)
  • pthread_cleanup_{push_defer,pop_restore}_np: Non portable extensions. The first sets PTHREAD_CANCEL_DEFERRED and pushes a cleanup handler; the second pops the cleanup handler (with possible execution) and restores the cancellation type.
Cancellation Points
In general, any function that might suspend the execution of a thread for a long time should be a cancellation point. In practice, it depends on the implementation and how POSIX it is. Cancellation points can be explicit or implicit.
  • pthread_testcancel: tests for a pending cancellation, effectively establishing a cancellation point.
  • pthread_cond_{timed}wait, pthread_join, sigwait and sem_wait.
  • All other syscalls that cause a process to block: read, select, wait and whatever in the libC uses them.
Thread linux special:

pthread_atfork:

Due to an implementation limitation, fork on a threaded application duplicates the currently running thread but not the others. Mutexes are duplicated in their current state; pthread_atfork gives us a chance to set things straight.
Thread attributes:
detachstate:

  • PTHREAD_CREATE_JOINABLE: thread termination synchronization is possible through pthread_join (termination code available.) Allocated thread resources are reclaimed after the join.

  • PTHREAD_CREATE_DETACHED: no join synchronization, resources are reclaimed when the thread terminates.

  • pthread_detach can force PTHREAD_CREATE_DETACHED

schedpolicy:

  • SCHED_OTHER, SCHED_RR, SCHED_FIFO (the last two require super user privileges.)

  • pthread_setschedparam can change schedpolicy on running thread.

schedparam:

  • Sets the priority value for the SCHED_RR and SCHED_FIFO scheduling policies, within the sched_get_priority_{min,max} range (the range depends on the policy), usually 0 or 1 to 99.

inheritsched:

  • PTHREAD_EXPLICIT_SCHED: scheduling policy for newly created thread determined by schedpolicy and schedparam.

  • PTHREAD_INHERIT_SCHED: inherited from the parent thread.

scope:

  • PTHREAD_SCOPE_SYSTEM: threads contend for CPU time with all other processes running on the machine (thread priorities are interpreted relative to other processes' priorities.)
  • PTHREAD_SCOPE_PROCESS: contention occurs with other threads of the running process. Not supported on Linux.
Default attribute values

Attribute       Value

Threads
detachstate     PTHREAD_CREATE_JOINABLE
schedpolicy     SCHED_OTHER
schedparam      0
inheritsched    PTHREAD_EXPLICIT_SCHED
scope           PTHREAD_SCOPE_SYSTEM

Mutexes
mutex kind      PTHREAD_MUTEX_FAST_NP

Condition variables
cond_attr       Value ignored

Pitfalls:

Race conditions
Execution results depend on the order in which code runs (code in different threads.) Circumvent with synchronization.
Deadlock
thread1 acquires lock1, thread2 acquires lock2. thread1 blocks acquiring lock2 and thread2 blocks acquiring lock1. thread1 can't unlock lock1 as it's blocked on lock2, thread2 can't unlock lock2 as it's blocked on lock1. Deadlock ensues.
Livelock
Two or more threads change their states in response to changes in the other thread or threads, without doing any useful work. Differs from a deadlock in that no thread is blocked or waiting for anything.
Priority inversion
A low priority thread holds a resource that it shares with a high priority thread. When the high priority thread should run, it can't, because the low priority thread hasn't released the resource. This gives medium priority tasks a chance to run, preventing the low priority thread from running and releasing what blocks the high priority thread. As a result, the high priority thread doesn't run when it should and what it controls doesn't happen.

Solutions are:
  • Priority inheritance: the lower priority thread temporarily inherits the priority of the high priority thread that blocks on its resource, giving it a chance to run when a medium priority thread would have run. Requires OS help.

  • Priority ceilings: associates a priority with each resource, transferred to the accessor of the resource + 1.
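Going back to the deadlock pitfall: a common way around it, which these notes don't cover, is to make every thread take its locks in one global order. The sketch below uses Python's threading.Lock and orders the locks by id(); the function names are made up.

```python
import threading


def acquire_in_order(*locks):
    """Acquire locks in one global order (here: by id) and return that order."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered


def release_all(ordered):
    """Release the locks in the reverse of the acquisition order."""
    for lock in reversed(ordered):
        lock.release()


def transfer(first, second, log, name):
    # Whatever order the caller names the locks in, they are always
    # taken in the same global order, so no cycle of waiters can form.
    held = acquire_in_order(first, second)
    try:
        log.append(name)
    finally:
        release_all(held)
```

Thread 1 can call transfer(lock1, lock2, ...) while thread 2 calls transfer(lock2, lock1, ...); both end up taking the same lock first.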
What's Linux specific

Document the NP stuff.

Does nothing: pthread_mutexattr_destroy, pthread_mutex_destroy

Error code
ESRCH is used for invalid thread scheduling parameter specification.

Monday, November 15, 2004

iSCSI, FCIP and iFCP.

The Fibre Channel protocol is a gigabit networking technology specifically developed for interconnecting servers and storage devices. Fibre Channel is based on the SCSI protocol.

iSCSI, FCIP and iFCP are storage protocols designed to use existing IP technologies, features and infrastructure:
  • iSCSI encapsulates SCSI commands into TCP/IP traffic.
  • FCIP encapsulates FC for transport over TCP/IP sockets. It's a tunneling protocol and it extends the FC fabric. FC services are kept intact. It relies on gateways to perform the encapsulation.
  • iFCP replaces the lower-layer Fibre Channel transport with TCP/IP. FC transport services are mapped to TCP/IP. This operation is performed by iFCP gateways. Once on IP, TCP/IP routing and switching can be used
I need to read two white papers to clarify a couple of things: one on iFCP and one on FCIP.


NAPI.

These are interesting notes about NAPI. In short, the idea is, as far as receiving packets is concerned:
  1. To generate an interrupt when a first packet arrives
  2. To disable interrupts for the device
  3. To let softirqs poll for remaining packets.
  4. Only when the kernel is done with a set of packets are interrupts enabled again for the device.
Now packets can be silently dropped. Device attribution is on a per CPU basis and mutually exclusive.
Interestingly enough, /proc/sys/net/core/netdev_max_backlog can be set to specify how many packets can be polled in one softirq handler invocation. Default is 300.

The article provides a fair amount of technical detail, including a sizing of the number of threads involved as well as information on some of the call chains.





Wednesday, November 10, 2004

MySQL HA

Reading this article today, about MySQL HA. What it takes:
  1. Replication between the MySQL databases
  2. Heartbeat between the two MySQL servers: requires a serial and an ethernet interconnect. The heartbeat configuration includes the failover policy and authentication.
  3. One virtual IP address to share between the servers, through which the application server will talk to the database.
One comment in the aforementioned link indicates that DRBD could be of some interest -- DRBD allows you to mirror partitions at the block level over the network.


Tuesday, November 09, 2004

MySQL replication

Replication: the master writes queries in a binary log that the slave reads and runs locally. The replication is asynchronous, and it can be filtered and occur at the table level or the database level. In recent (4.x) incarnations of MySQL, reading the binary log and running the queries on the slave are separate tasks, for improved performance.

Replication is turned on with the following steps:
  1. A configuration account is created on the slave, binary logs are enabled
  2. A snapshot of the master is taken and the logs are reset
  3. The snapshot is installed on the slave, and replication is configured on the slave
  4. The slave is restarted
Different topologies can be created:
  • In a master/slave topology, the master executes read/write requests. The slave syncs to the master but can also execute read requests
  • In a master/master topology, both execute read/write requests and synchronize mutually.
  • In a load balancing environment, several slaves are used to honor read queries and write queries go to one master the slaves sync to.


ARP poisoning, port stealing and MITM.

ARP poisoning:

As explained here, one can craft ARP packets with some destination IP and the attacker's MAC address to poison an ARP cache: the existing ARP entry matching the destination IP address is updated with your MAC address. Next time a packet flies to the destination IP address, it gets to you -- you can use packet forwarding to still deliver the packet to its destination. Note that since ARP cache entries expire, it may be necessary to send the bogus ARP packet periodically. Different types of ARP messages might have to be used to poison the ARP caches of different types of hardware/OS peers.

The ARP protocol is stateless. Most OSes update their caches with ARP replies without having ever solicited one. Some, like Solaris, are a little tougher but in this case, one can trigger an ARP request by creating a spoofed ICMP request: you ping a destination with the IP and the attacker's MAC to force an ARP request, after which you send the fake ARP reply. Circumventing ARP poisoning can be done through active/passive monitoring, static ARP (not flexible) or Secure ARP.

MITM Attack:

Sniffing would get you there in terms of observing traffic, but it doesn't give you the ability to prevent a host from seeing its traffic before you do. What you achieve effectively is a MITM (Man In The Middle) attack, where one can sit listening to and relaying traffic between two hosts without the hosts being able to tell. As for the actions the MITM can take, they're numerous: injection (commands, insertion of malicious code in JS, etc... -- sequence number modification required of course), key manipulation, filtering, etc...

Port stealing:

This ARP poisoning based MITM attack works better on a local network segment. In order for it to work in a switched environment, you need to steal the switch ports by sending the switch packets carrying the MACs of the hosts you want to intercept traffic from, so that the switch modifies its CAM table. Once the traffic is captured, you need to reset the CAM entries to be able to resend the traffic to its rightful destination -- this has to be done for every packet, which is a lot of work, and traffic might be dropped. Port stealing can be circumvented by using port security on the switch.

More info on that here, here and here.

Of course, all this rogue packet crafting is done using libnet.


Thursday, November 04, 2004

Shell stuff

A couple of shell related things I gathered:

  • Cool guide
  • Bash's `:' does nothing and always returns 0 (success). For example, use it as in ` || :'
  • trap ' DEBUG|EXIT' (EXIT overrides all installed signal handlers!)
  • ${X:} stuff. All string operations are interesting but hard to remember
  • s/\(\)/\1/
  • set - Very handy -- look at -x/-u/-n for instance
  • Awk Knowledge Base.

Thursday, October 21, 2004

Message Passing Interface.

MPI stands for Message Passing Interface. It's a standard and doesn't define a particular implementation. It doesn't tackle issues like threading and the like (although it is thread safe.)

MPI applications have to find where they fit within the overall execution (they have a rank) and code hosting several aspects of a particular task can act differently depending on its rank (think of a master/slave scenario.)

Infrastructure code to launch MPI aware processes on a set of machines can be written using MPI. LAM is one of them.

This seems to be containing interesting tutorial materials on LAM and MPI.

Wednesday, October 13, 2004

JBOSS, AOP

I started to read on JBOSS. There's a lot to digest, I need to gather better pointers at tutorials and documentation. Looking for things JBOSS, I came across Cedric's blog, which has a lot of interesting things to peruse.

Reading on JBOSS got me to read on AOP (Aspect Oriented Programming), especially this introductory article:
  • Deals with orthogonality of objects concerns (what they do)
  • Methodology to decompose concerns and weave (reunite) them into the final object, for greater maintainability. The weaving part is interesting because it can be done in different ways: some program could merge the bits together at the source code level; if you're doing it in Java, it can be implemented at the bytecode level; or it could rely on a dedicated VM to do the weaving (being fed external data.)

    Weaving rules are defined in terms of aspects, which are defined by a combination of join points (any identifiable point in a program: method invocation, field access), pointcuts (the language construct selecting a set of join points), and advice (pieces of an aspect implementation to be executed at pointcuts.)

  • JBOSS has an AOP framework. I started to read about it, but I need to resolve a couple of pointers before it becomes productive reading, notably:

    • the new JDK5.0 extensions: Annotations (I must admit I need more documentation on this)
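
To make the join point / pointcut / advice vocabulary concrete, here is a minimal sketch in Python. It is not how JBOSS AOP works; it only mimics weaving at class-definition level. The `weave` helper, the `Account` class and the `[dw]*` name pattern are all made up for the illustration: the pattern plays the role of a pointcut, method invocations are the join points, and the logging callback is a "before" advice.

```python
import fnmatch
import functools


def _advise(method, advice):
    """Return a wrapper that runs the advice before the original method."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        advice(method.__name__, args)          # "before" advice at the join point
        return method(self, *args, **kwargs)
    return wrapper


def weave(cls, pointcut, advice):
    """Wrap every public method of cls whose name matches the pointcut pattern."""
    for name, member in list(vars(cls).items()):
        if (callable(member) and not name.startswith("_")
                and fnmatch.fnmatchcase(name, pointcut)):
            setattr(cls, name, _advise(member, advice))
    return cls


class Account:
    """Business logic only: no auditing code in here."""
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def report(self):
        return self.balance


# The auditing concern lives outside the class and is woven in afterwards.
audit_log = []
weave(Account, "[dw]*", lambda name, args: audit_log.append((name, args)))

account = Account()
account.deposit(100)
account.withdraw(30)
account.report()        # not selected by the pointcut, so not audited
```

The point of the exercise is that `Account` never mentions auditing; the concern is attached from the outside, which is the orthogonality the article talks about.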

Tuesday, October 12, 2004

Crash only software, Linux RT

Crash only software, another paper by George Candea.
  • Non volatile state managed by dedicated state store
  • Components have externally enforced boundaries
  • Interactions between components have a timeout
  • All resources are leased
  • Requests are self-describing
This is a level of modularity where modules are watched by a supervisor and crashed when something wrong is detected (for instance a request isn't fulfilled.)

Also following some RT Linux stuff, after I read about the MontaVista RT patch announcement.

Monday, October 11, 2004

CS444a : Principles of Dependable Computer Systems

I started to read all articles found in this Stanford course syllabus: CS444a : Principles of Dependable Computer Systems

The basics of dependability:
  • Dependability=RASS (Reliable, Available, Safe and Secure)
  • MTTF + MTTR = MTBF. Availability = MTTF/(MTTF+MTTR)
  • Dependability metrics considered in SLA (Service Level Agreement)
  • Heisenbug, Bohrbug, Schroedinbug, Mandelbug
  • Failure categories: permanent/intermittent/transient failure
  • Failed system behavior: fail-stop, Byzantine, fail-fast
  • Dealing with faults: detection, diagnosis, isolation, repair, recovery
I need a way to remember: ACID: Atomicity, Consistency, Isolation, Durability
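
A quick worked example of the availability formula from the list above, as a small Python sketch. The MTTF and MTTR figures are invented for the illustration (999 hours between failures, 1 hour to repair), which works out to "three nines" and roughly 8.76 hours of downtime per year.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mttf_hours, mttr_hours):
    """Fraction of time the system is up: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical figures, only here to exercise the formula.
mttf = 999.0   # mean time to failure, in hours
mttr = 1.0     # mean time to repair, in hours

mtbf = mttf + mttr                      # mean time between failures
avail = availability(mttf, mttr)
downtime_per_year = (1.0 - avail) * HOURS_PER_YEAR

print(f"MTBF={mtbf:.0f}h availability={avail:.4f} "
      f"downtime={downtime_per_year:.2f}h/year")
```

Note how MTTR matters as much as MTTF: halving the repair time buys about as much availability as doubling the time between failures.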

Tuesday, May 25, 2004

mySQL.

Random stuff about mySQL (I need it because I'd like to implement something equivalent to this:)
  • Manual online (the manual features user comments worth reading)
  • Want to try mySQL stuff online? Try sqlzoo.net
  • Database concepts here (External/Conceptual/Internal view and all that good stuff.)
Random note: maximum database size depends on the filesystem. On Linux, for databases > 2GB, LFS (Large File Support) should be present on the system (libc/kernel.) Beware that some filesystems aren't LFS compliant.

Tuesday, January 06, 2004

Quick read on CIFS

CIFS Common Internet File System:
  • CIFS is M$'s way of doing network file sharing
  • Allows for sharing of files, directories, printers, etc...
  • There are protocols for service announcement, naming, authentication, and authorization
  • The file sharing protocol is Server Message Block (SMB), originally developed to run over NetBIOS (now available over TCP/IP as NBT, NetBIOS over TCP/IP)
  • NBT: three services must be implemented:

    1. Name service (NetBIOS name to IP address mapping, UDP port 137)
    2. Datagram service (NetBIOS datagram delivery via UDP port 138)
    3. Session service (point-to-point, connection-oriented NetBIOS sessions over TCP port 139, the traditional transport for SMB.)

  • Communication endpoints are 16-byte strings called NetBIOS names.
  • Group names can be shared by multiple clients.
  • SMB: Raw (naked) over TCP (port 445) or over NBT session service.
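
To see what a 16-byte NetBIOS name looks like in practice, here is a small Python sketch. It goes slightly beyond my notes: the layout (15 bytes of space-padded name plus one suffix byte) and the "first-level encoding" used on the wire by the NBT name service are taken from RFC 1001, and the helper names are mine. The default suffix 0x20 is, as far as I know, the one conventionally used for the file server service.

```python
def netbios_name(name, suffix=0x20):
    """Build the 16-byte NetBIOS name: 15 space-padded bytes plus a suffix byte."""
    raw = name.upper().encode("ascii")
    if len(raw) > 15:
        raise ValueError("NetBIOS names are limited to 15 characters")
    return raw.ljust(15, b" ") + bytes([suffix])


def first_level_encode(name16):
    """RFC 1001 first-level encoding: each byte becomes two letters 'A'..'P'."""
    out = bytearray()
    for byte in name16:
        out.append(ord("A") + (byte >> 4))      # high nibble
        out.append(ord("A") + (byte & 0x0F))    # low nibble
    return out.decode("ascii")


padded = netbios_name("fred")
encoded = first_level_encode(padded)
print(len(padded), encoded)
```

The encoded form is always 32 characters long, which is why NBT name packets look like strings of capital letters in a packet capture.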

Tuesday, August 12, 2003

About SNMP (Simple Network Management Protocol)

I don't remember where I read this from. Here are the notes:
  • SNMP: servers run on devices, clients query devices. Servers reply to an SNMP request if the community string in the request matches the one they expect.
  • SNMP Management Information Base (MIB) defined using ASN.1 format
  • SNMP communication defines a UDP message (get, get-next, set, trap) encoded in PDUs (Protocol Data Units.)
  • SNMP defines a starting set of values (identified by a string or a numeric identifier in dot-notation)
  • SNMP defines a standard way of adding objects
  • SNMP MIBs are arranged in a tree, with ISO internet on top and four main branches

internet -+- mgmt (standard SNMP object)
          |- private (vendor SNMP object)
          |- experimental (not really used)
          `- directory (not really used)
  • Tree leaves are either discrete MIB objects (`.0' extension to their name) or table MIB objects (`.' extension to their name)
  • MIB objects have specific values, defined in SNMP primitive types (text, counter, gauge, integer, enum, etc...) MIB objects have access values (ro, rw, wo)
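
The dot-notation and the four branches can be sketched in a few lines of Python. The numeric values are the standard ones (internet is 1.3.6.1, with directory, mgmt, experimental and private numbered 1 to 4 underneath); the function names are mine, and the two OIDs at the bottom are just well-known examples (sysDescr.0 and the Cisco enterprise subtree).

```python
INTERNET = (1, 3, 6, 1)   # iso.org.dod.internet
BRANCHES = {1: "directory", 2: "mgmt", 3: "experimental", 4: "private"}


def parse_oid(text):
    """Turn '1.3.6.1.2.1' (or '.1.3.6.1.2.1') into a tuple of integers."""
    return tuple(int(part) for part in text.strip(".").split("."))


def classify(oid_text):
    """Return the name of the branch under 'internet', or None if outside it."""
    oid = parse_oid(oid_text)
    if len(oid) < 5 or oid[:4] != INTERNET:
        return None
    return BRANCHES.get(oid[4])


sys_descr = "1.3.6.1.2.1.1.1.0"      # sysDescr.0, a discrete object (.0 suffix)
cisco_subtree = ".1.3.6.1.4.1.9"     # a vendor subtree under private.enterprises

print(classify(sys_descr), classify(cisco_subtree))
```

This only classifies identifiers; talking to an actual agent needs an SNMP library or the usual command line tools.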
I'm adding more to this SNMP stuff:
  • The SNMP FAQ is huge. It's here, and it points to a good introduction available as a PDF.

Tuesday, June 10, 2003

About VLANs.

Notes gathered during some reading about VLANs. See also the vconfig man page.
  • VLAN architecture and frame tagging techniques. 802.1q specifies a tag header following the source MAC address field.
  • VLAN tagged frames carry VLAN id and priority information (priority use is defined in 802.1p)
  • VLANs can be set up as port based, MAC based or Layer3 protocol and address based.
  • CRC (Cyclic Redundancy Check): the remainder of the bit string, converted to a polynomial with modulo 2 coefficients, divided by another pre-defined polynomial (key.) Ethernet uses key 0x04c11db7 (degree 32, 32nd degree coefficient implicitly 1.)
  • The CRC is added to the message polynomial after multiplication by x^32 and sent. Upon recomputation over the received message, the result should be zero. More info here.
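
The CRC property described above can be checked with a short Python sketch. This is the plain textbook polynomial division with the 0x04c11db7 key, processed most significant bit first with a zero initial register. It is not the exact Ethernet FCS, which additionally reflects the bits, presets the register to all ones and complements the result. The function name and the sample message are mine.

```python
POLY = 0x04C11DB7   # generator polynomial; the x^32 term is implicit


def crc_remainder(data):
    """Remainder of data(x) * x^32 divided by the generator, modulo 2."""
    reg = 0
    for byte in data:
        reg ^= byte << 24                    # feed the next 8 message bits
        for _ in range(8):
            if reg & 0x80000000:             # top bit set: subtract (xor) the key
                reg = ((reg << 1) & 0xFFFFFFFF) ^ POLY
            else:
                reg = (reg << 1) & 0xFFFFFFFF
    return reg


message = b"VLAN notes"
crc = crc_remainder(message)

# Sender: message * x^32 + CRC, i.e. the 4 CRC bytes appended to the message.
codeword = message + crc.to_bytes(4, "big")

# Receiver: dividing the whole codeword again leaves no remainder.
check = crc_remainder(codeword)

# Flipping a single bit in transit leaves a non-zero remainder.
corrupted = bytes([codeword[0] ^ 0x01]) + codeword[1:]
bad_check = crc_remainder(corrupted)

print(hex(crc), check, hex(bad_check))
```

The appended CRC makes the codeword an exact multiple of the key, which is why the receiver gets zero; any single-bit error breaks that divisibility.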

Wednesday, April 02, 2003

On purpose.

Over the course of a year, I've been spending some time reading technical stuff and keeping notes in a log file. Now this log file will be converted into a blog because that's the thing to do these days.

So, on to the first post.
