Satisfied Programmer

Wednesday, February 4, 2009

Detecting web-crawlers using cookies

This is how I split incoming traffic into three branches:
1) cache: web crawlers
2) browse: human users that just quickly browse the pages
3) active: human users that actively use the website

Let ActiveKey and BrowserKey be two fixed arbitrary strings.

This is the algorithm:
1. Place a bit of JavaScript in the header of every page. This JavaScript does the following:
1.1 Check whether the browser has a cookie named SID.
1.1.1 If it doesn't, set one from JavaScript with the value BrowserKey+RandomNumber, and
1.1.2 reload the page with a JavaScript window.location.reload() call.
1.2 If it does, leave the page as is.

This separates web crawlers from human users, assuming the crawlers do not run JavaScript. A web crawler is any client that doesn't send a cookie named SID containing the substring BrowserKey (or, after step 2 below, ActiveKey).
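A minimal sketch of the header script from step 1, assuming a made-up BrowserKey value ("brw-") and a cookie path of "/"; neither is specified in the post. The decision logic is kept in plain functions so it can be exercised outside a browser, and the reload only happens if the cookie was actually stored, which is my addition to avoid a reload loop when cookies are disabled.

```javascript
// Hypothetical value; the post only says BrowserKey is a fixed arbitrary string.
const BROWSER_KEY = "brw-";

// Parse a "name=value; name2=value2" cookie string and
// return the value of `name`, or null if it is not present.
function readCookie(cookieString, name) {
  for (const part of cookieString.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq > 0 && trimmed.slice(0, eq) === name) {
      return trimmed.slice(eq + 1);
    }
  }
  return null;
}

// Step 1.1: decide what the header script should do for this page load.
// Returns null when SID already exists (step 1.2), otherwise the cookie to set.
function sidCookieToSet(cookieString, randomNumber) {
  if (readCookie(cookieString, "SID") !== null) {
    return null;
  }
  return "SID=" + BROWSER_KEY + randomNumber + "; path=/";
}

// Browser wiring; skipped when there is no document (for example under Node).
if (typeof document !== "undefined") {
  const cookie = sidCookieToSet(document.cookie, Math.floor(Math.random() * 1e9));
  if (cookie !== null) {
    document.cookie = cookie;
    // Reload only if the cookie stuck, so clients with cookies disabled do not loop.
    if (readCookie(document.cookie, "SID") !== null) {
      window.location.reload();
    }
  }
}
```

On the second load the cookie is already there, so the script does nothing and the page stays as served.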

2. If a user clicks a button and starts to use the website as an application (for example, puts products into a shopping cart), the handlers of all button clicks go through the same code, which changes the state:
2.1 Change the value of the SID cookie to a string of the form ActiveKey+RandomNumber.
2.2 Any Ajax request or page navigation will now automatically pass this cookie too.
This step separates users in categories (2) and (3).
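On the server side, the three branches can then be told apart from the Cookie header alone. The sketch below assumes made-up key values ("brw-" and "act-") and hypothetical function names; it is one possible reading of the scheme, not the code that ran on any real site.

```javascript
// Hypothetical values; the post only requires two fixed arbitrary strings.
const BROWSER_KEY = "brw-";
const ACTIVE_KEY = "act-";

// Return the value of cookie `name` from a Cookie header, or null if absent.
function cookieValue(cookieHeader, name) {
  for (const part of (cookieHeader || "").split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) return rest.join("=");
  }
  return null;
}

// Step 2.1: the value that the click handlers would store in SID.
function activeSid(randomNumber) {
  return ACTIVE_KEY + randomNumber;
}

// Map a request's Cookie header to one of the three branches.
function classify(cookieHeader) {
  const sid = cookieValue(cookieHeader, "SID");
  if (sid === null) return "cache";                 // no SID: treated as a crawler
  if (sid.includes(ACTIVE_KEY)) return "active";    // user has clicked something
  if (sid.includes(BROWSER_KEY)) return "browse";   // user is only reading pages
  return "cache";                                   // unknown SID: treated as a crawler
}
```

In the browser, step 2.1 would then amount to something like document.cookie = "SID=" + activeSid(n) + "; path=/" inside the shared click handler.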

What is the point? The point is that web crawlers can be given older cached content. A site with many pages can serve the bots from pregenerated cache files, which gives a fast response. The web crawlers can even be served by a separate server altogether.

Users who want a fast browsing experience can also be served by a separate server, and so on.
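Once a request has a branch name, sending it to a separate server is just a lookup. The backend addresses below are invented for illustration, and falling back to the cached site for anything unrecognized is my own choice, not something the scheme requires.

```javascript
// Hypothetical backend addresses, one per branch.
const BACKENDS = {
  cache: "http://cache.internal:8080",
  browse: "http://browse.internal:8080",
  active: "http://app.internal:8080",
};

// Pick the upstream server for a branch; unknown branches get the cached site.
function backendFor(branch) {
  return Object.prototype.hasOwnProperty.call(BACKENDS, branch)
    ? BACKENDS[branch]
    : BACKENDS.cache;
}
```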

Wednesday, May 7, 2008

Pliant Web UI

Here is a Pliant-powered web application:


Wednesday, October 24, 2007

Dualboot CoLinux/Windows setup

This article describes a computer setup. First a standard Ubuntu/Windows dual boot setup is created.
Ubuntu version: feisty. Windows version: Windows XP.

Next, we make it possible to boot the same Ubuntu installation from within Windows using coLinux (http://www.colinux.org).
The docs for this are here: http://colinux.wikia.com/wiki/Dual_boot_system
There is a problem with passing the COLINUX environment variable, because Ubuntu's new init system uses the "upstart" package
rather than "sysv". I made a small patch to make it work (it's a hack).


--- process.c-original 2007-10-16 10:36:53.000000000 -0400
+++ process.c 2007-10-16 11:18:11.000000000 -0400
@@ -221,7 +221,7 @@
process_setup_environment (Job *job)
{
char **env;
- char *path, *term, *jobid;
+ char *path, *term, *jobid, *colinux;

nih_assert (job != NULL);

@@ -231,6 +231,10 @@
NIH_MUST (path = nih_strdup (NULL, getenv ("PATH")));
NIH_MUST (term = nih_strdup (NULL, getenv ("TERM")));

+ colinux = NULL;
+ if (colinux = getenv("COLINUX"))
+ colinux = nih_strdup (NULL, colinux);
+
if (clearenv () < 0)
[...]
jobid = nih_sprintf ([...]->id));
if (setenv ("UPSTART_JOB_ID", jobid, TRUE) < 0)
[...]


Just apply it to the process.c file in the upstart source, and rebuild the upstart deb. You can download my upstart.deb here; it's good for Ubuntu feisty: http://archimedes.hypervolume.com/~boris/upstart

Following the coLinux wiki howto, I created an /etc/init.d/colinux script and added it to rc2.d as S01colinux, so that it starts first:


#!/bin/sh
# colinux: install either the coLinux-related configuration files
# or their direct-boot counterparts.
# Also touch /var/local/colinux if we are in coLinux mode.

if [ -n "$COLINUX" ]; then
    SUFFIX=colinux
    # Create the coLinux block devices unless they already exist.
    for i in 0 1 2 3 4 5 6 7; do
        [ -e "/dev/cobd$i" ] || mknod "/dev/cobd$i" b 117 "$i"
    done
else
    SUFFIX=non-colinux
fi

# Copy the variant matching the current boot mode over each live config file.
for conf_file in /etc/fstab /etc/network/interfaces /etc/resolv.conf /etc/asound.conf; do
    cp -f "$conf_file-$SUFFIX" "$conf_file"
done

# Marker file that other init scripts test for.
if [ -n "$COLINUX" ]; then
    touch /var/local/colinux
else
    rm -f /var/local/colinux
fi

exit 0


These are the one-liners that I added at the top of each init script that I disable in coLinux mode:


boris@freedom:/etc/init.d$ grep colinux *
acpid:test -f /var/local/colinux && exit 0
acpi-support:test -f /var/local/colinux && exit 0
apmd:test -f /var/local/colinux && exit 0
bluetooth:test -f /var/local/colinux && exit 0
gdm:test -f /var/local/colinux && exit 0
hotkey-setup:test -f /var/local/colinux && exit 0
nvidia-kernel:test -f /var/local/colinux && exit 0
pcmciautils:test -f /var/local/colinux && exit 0
powernowd:test -f /var/local/colinux && exit 0
vbesave:test -f /var/local/colinux && exit 0


Notice that the colinux script installs copies of fstab, asound.conf, and resolv.conf.
Here they are:


fstab-colinux:

/dev/cobd0 / ext3 defaults,errors=remount-ro 0 1
/dev/cobd1 none swap sw 0 0
cofs0 /mnt/data cofs defaults 0 0


My Windows C: is NTFS, so I created D: as FAT32, which I share as the cofs0 device. (See cofs.txt in the coLinux directory.)
The fstab-non-colinux is the fstab you had from the Ubuntu installation.

The reason to change asound.conf is to forward all sound to a Windows-based sound daemon called PulseAudio. You will need to install it on the Windows end.


asound.conf-colinux:
---8<------
pcm.!default { type pulse }
ctl.!default { type pulse }
pcm.pulse { type pulse }
ctl.pulse { type pulse }
---->8----


The asound.conf-non-colinux is empty.

Before I show resolv.conf, let's talk about networking. coLinux creates a TAP device, which I use for Linux-Windows communication only. To communicate from coLinux to the internet I use the "Slirp" device. The reason is that this way, if you set up a VPN on Windows to your work, coLinux will be able to access it too, without any config. "Slirp" works well for the internet, but it is too slow for local communication. That's why I use TAP, with the IP 10.0.0.1 on Windows and 10.0.0.2 on Linux. Note that "Slirp" issues
10.0.2.* subnet IPs, so it's a different subnet. Read this for more info: http://colinux.wikia.com/wiki/Network#Accessing_windows_based_VPN

Here's my resolv.conf-colinux:


boris@freedom:/etc/init.d$ cat /etc/resolv.conf-colinux
nameserver 10.0.2.3
nameserver 192.168.5.10
nameserver 192.168.5.12


The first nameserver is due to the Slirp device. The rest are nameservers of my company, after I VPN into it on windows side.


Okay, so that covers the basic coLinux setup. Next comes the X configuration. Download Xming from its SourceForge page.
If you go to the author's page, the download link requires a donation of 10 euros. You can download Xming-fonts from the author's page, though.
Install Xming and Xming-fonts.

Download puttygen, pageant, putty, and plink. Set up a public/private key pair according to the PuTTY documentation.
Start pageant and add your key to it.
Add a pageant shortcut to Startup, and pass the path to your private key as its command-line argument.
Verify that plink user@10.0.0.2 doesn't require a password to log in, where 10.0.0.2 is the coLinux IP.

Start XLaunch, which is a program that comes with Xming. Select "fullscreen" and PuTTY, and fill in all the params.
For the application to launch, type the path to "stumpwm" on your coLinux system. XLaunch will create
a shortcut on your desktop to start X/StumpWM.

To automate coLinux startup further, download the 'clm' tool for coLinux (http://colinux.wikia.com/wiki/Clm),
and add a service for your coLinux. Then create a start_colinux.bat containing (net start "colinux"), which you can place on your desktop.
This way you will not have a coLinux console open all the time. If you want to attach a console to a running coLinux,
create a shortcut to colinux\clm.bat with the command-line arguments "open fltk" and place it on your desktop.

Add keyboard shortcuts to all your Windows apps. The setting is found in the properties of the shortcut. I have:


- Ctrl-Alt-X : start X
- Ctrl-Alt-C : start command line (configure the font and dimensions so it fills up the full screen)
- Ctrl-Alt-R : show coLinux console (open fltk stuff).


Todo: how to automate starting of the sound server.

Hot laptop that doesn't heat

The TC1100 is a tablet PC with a 1.1 GHz Intel CPU and a 40 GB hard drive. It gets hot, but not at the keyboard. This is because it is designed to have all the internals in the monitor. This makes the monitor heavy, but it doesn't tip over because of the smart design.


TC1100 YouTube Video

TC1100 Images



My TC1100 has 1 GB of RAM. I also bought a docking station, which has a DVD/RW drive. The DVD drive is very slim, and can be attached directly to the laptop using a USB cable (no docking station).

In addition to being a great coding machine, this laptop is great for doing homework on and converting it to PDF. It is also possible to write comments on top of existing PDFs with the program jarnal on Linux (I don't know about Windows). Windows has very good support for Tablet PC.


The laptop has a built-in mic, and in addition to standard speaker/mic jacks, it has a jack like the ones on cellphones. So you can plug in your cellphone headset and speak comfortably. It also has Bluetooth! I haven't tried it yet, but I may be able to use one of those wireless Bluetooth headsets to chat over Skype.


The webcam that I bought for the laptop is Logitech QuickCam for Notebooks which has a large enough clamp to attach to the fat TC1100 display. (Remember, all internals are in the display, so keyboard is very thin, but display is not).


The TC1100 has no heat problem when you work on it as a laptop. However, if you use it as a tablet pc, you will feel the heat from the monitor, which is annoying. But that happens with every laptop and tablet pc.


Now, the software that I run: I have Windows XP and Ubuntu installed in dual-boot fashion. I access Ubuntu standalone, or through "colinux" while I am still in Windows. It works very well, and the setup deserves another post. (Dualboot Colinux/Windows setup)

The price of this laptop on eBay is now $700 with the docking station and DVD drive. HP/Compaq stopped producing laptops with this design (the keyboard has no internals), and there is a petition that can be signed online to resurrect this laptop. Basically, a faster CPU is what will be needed in the future. It's OK for now.





