Prism & Co.: Raising the Shields Is Not Enough

In the past couple of weeks, a number of revelations have shown the extent to which secret service organizations from around the world tap the Internet to spy on their own citizens and those of other nations, store data about them, and record their use of the network and communication metadata such as phone call records. While I think that some of these measures are justified when it comes to countering international crime and terrorism, the line for me is crossed when data of innocent people from around the world is copied and stored indefinitely. Wiretapping the embassies of other nations and using these resources for industrial and political espionage against friends and partners is also something I find unacceptable. This has to stop, and I hope that people and politicians in free and democratic countries around the world will find the courage to control and restrict their secret services and those supporting them, rather than have their liberty and freedom restricted and undermined by them.

Having said this, I find myself ever more encouraged to protect myself when using the Internet. Using Owncloud to ensure my private data is hosted on my own servers, and communicating with them in a secure fashion, can only be a first step. I have quite a number of things in mind that I want to change over the course of the next months. Watch this blog for the details to come.

But raising the shields by storing my data in my own network and encrypting more of my communication is not the cure, it's just treating the symptoms. Privacy and freedom have to come back to communication, and only internationally agreed limits on what intelligence agencies are allowed to do on and off the Internet will bring back what we have lost.

Before My First Computer

This blog post is a bit about nostalgia, brought about by a current quality-time project I am working on. If you were a teenager in the 1980s and are reading this blog, then your experiences might have been similar 🙂

When I was in my teenage years in the mid-1980s, home electronics projects and home computers were the hype of the day for kids fascinated by blinking lights and the mysterious powers of electricity. I must have been one of them, because I didn't give up talking about the subject with my parents until they finally gave in and got me an electronics experiment kit and later on a computer. So before that legendary first C64 finally arrived one day, I got, and bought with hard-earned money, a number of electronics kits, culminating in the extension pack you see on the left, the Busch electronics experiment extension kit 2061.

Still available today (in a slightly different color), these kits were a real enabler for me. Long before I had physics classes in high school, they taught me the basics such as voltage, current, resistors, transistors, capacitors, coils, flip-flops, timer circuits, radios, integrated circuits, boolean logic, and so on. Sure, I eventually heard about some of these things again in high school, but it is the time I spent experimenting with these kits and what I learned then that I still remember vividly.

And now, almost 30 years later, I still profit from it, even to the point that I have put the kits to good use again to prototype a circuit I want to build for use with a Raspberry Pi and a PiFace extension board for some real-world interaction. This is kind of electronics 2.0 for me, as this time it's no longer "only" something on a board that interacts with the real world but something that extends its reach via the Raspberry Pi over Wi-Fi and into the Internet. Electronics, computers and the net combined. I wouldn't have dreamed of that 30 years ago, and it still fascinates me today.
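
To give an idea of how little code such real-world interaction takes on the Raspberry Pi these days, here is a minimal Python sketch. It assumes the pifacedigitalio library that is commonly used with the PiFace board; the pin assignments are hypothetical placeholders, not my actual circuit:

# Minimal sketch, assuming the pifacedigitalio Python library for the PiFace
# Digital board. The pin numbers are placeholders, not my actual circuit.
import time
import pifacedigitalio

pfd = pifacedigitalio.PiFaceDigital()

while True:
    if pfd.input_pins[0].value:        # e.g. a switch wired to input 0
        pfd.output_pins[0].turn_on()   # drive output 0, e.g. an LED or relay
    else:
        pfd.output_pins[0].turn_off()
    time.sleep(0.1)                    # poll ten times per second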

How times have changed. Back then it took hard persuasion, or saving money for a long time, before I got the first kit and could buy further extension packs. Today things have become cheaper and more accessible, and if I have an idea that requires additional hardware, it can be organized almost overnight via the Internet or a trip to a local electronics store. Also, money for new stuff is not an issue anymore, which helps tremendously as well.

Are you feeling a bit nostalgic or inspired now and thinking about getting those experiment kits out of storage again?

Selfoss – How Good It Feels To Use My Own Webservices From Across The Atlantic

Due to Google Reader's imminent demise I've switched over to a self-hosted solution based on the Selfoss RSS aggregator running on a Raspberry Pi in my network at home. I've been using it for around two weeks now and it just works perfectly and has all the features I need. And quite frankly, every time I use it I get a warm and glowing feeling for a number of reasons: First, I very much like that this service runs from my home. Second, I very much like that all my data is stored there and not somewhere in the cloud, exposed to the prying eyes of a commercial company and half a dozen security services. Also, I like that I'm in control and that all communication is encrypted.

Although it's quite natural today, I get an extra kick out of the fact that I am sitting halfway across the globe and can still communicate with this small box at home. Sure, I've been using services hosted in my home network while traveling abroad, such as my VPN gateway and Owncloud, for quite some time now, but those always run in the background, so to speak, with little interaction. Reading news in the web browser on my smartphone, delivered by my own server at home, however, is a very direct interaction with something of my own far, far away. This is definitely my cup of tea.

French Regulator Says Interconnect Costs Per Subscriber Are Tens Of Cents Per Month

In many countries there is currently a debate, fueled by Internet access providers (IAPs), who argue that the ever increasing amount of data flowing through their networks from media streaming platforms will lead to a significant increase in prices for consumers. The way out, as they portray it, is to not only get paid by their subscribers but also by the media streaming platforms. In practice this would mean that Google and Co. would not only have to pay for the Internet access of their data centers but would also be required to pay a fee to the thousands and thousands of IAPs around the globe.

Unfortunately, I haven't seen a single one of these IAP claims backed up with concrete data on why monthly user charges are no longer sufficient to improve the local infrastructure in the same way as has been happening for years. Also, there has been no data on how much interconnect charges to other networks at the IAP's border to long-distance networks would increase on a per-user basis. Thus I was quite thankful when the French telecoms regulator ARCEP recently published some data on this.

According to this article in the French newspaper Le Monde (Google's translation to English here), ARCEP says that interconnect charges are typically in the range of tens of cents per user per month. In other words, compared to the monthly amount users pay for their Internet access, the interconnection charge per user is almost negligible. Also, interconnection charges keep dropping on an annual basis, so it's likely that this effect will compensate for the increasing traffic from streaming platforms.

So the overwhelming part of what users pay per month for their Internet access goes toward the cost of running the local access network up to the interconnect point. This means they pay for the facilities, the routers, the optical cables to the switching centers, and from there the optical cables to street hubs or the traditional copper cables running directly from the switching centers to their homes.

Which of these things becomes more expensive as data rates increase? The cost of the buildings in which the equipment is housed stays the same or even decreases over time, as equipment keeps getting smaller and more centralized, so the money doesn't go there. It's also likely that fiber cables do not have to be replaced, thanks to technology improvements that ensure a continuous increase in the amount of data that can be piped through existing cables. That leaves the routing equipment in central exchanges and street hubs, which has to be continuously upgraded. That's nothing new, however, and has been done in the past, too, without the need to increase prices. Quite the contrary.

One activity that is undeniably costly is laying new fiber in cities to increase data rates to the customer premises. Users who take advantage of this, however, usually pay a higher monthly fee compared to their previously slower connection. And from what I can tell, network operators have become quite cost conscious and only build new fiber access networks if they are reasonably certain they will get a return on their investment from the monthly subscriber fee. In other words, this also can't be the reason behind the claim that increasing data rates will increase prices.

But perhaps I'm missing something that can be backed up with facts?

My Mobile Data Use Last Month

And just as a quick follow-up to the previous post on my fixed-line data use, here are some numbers on my mobile data use last month. According to Android's data monitor I used 367 MB, after 439 MB the month before. The number includes:

  • 135 MB for mobile web browsing (due to not using Opera Mini anymore)
  • 55 MB for Google Maps (very handy to check traffic on the way to and from work to decide on alternative routes in real time)
  • 33 MB for YouTube
  • 27 MB for email
  • 20 MB for streaming podcasts
  • 17 MB for app downloads (the new Opera browser)
  • 10 MB for calendar and address book synchronization

Not included is the data I use with my notebook on the way to and from work, as I use a different SIM card for that purpose, for which I have no records. But even if I included that, I am pretty sure I would still be well below the 1 GB throttling threshold of my current mobile contract.

From a different point of view, however, my mobile data use pales in comparison to the 70 GB I transferred over my VDSL line at home last month.

Some Thoughts On My Monthly (Fixed Line) Data Use

In July last year I calculated how many bits per second I consume on average, 24/7. My calculation was based on a usage of around 30 GB per month and resulted in 92.6 kbit/s. Since then my use has increased quite a bit. Last month I transferred almost 70 GB over my VDSL line at home (63 GB down, 6 GB up).

In addition to what I used my VDSL line at home for a year ago, I have started using it to tunnel all my mobile data traffic through my personal VPN server at home. I assume that accounts for a significant part of the 6 GB of data that flowed in the uplink direction (plus the same amount in the downlink direction, due to the round-trip nature of the application). A couple of additional gigabytes come from my increased web radio use with my new Squeezeplug. But the bulk of the increase comes from the much heavier use of video streaming services, as my XBMC home server has made it a lot simpler and more fun to access content.

Only little of the content was HD, however, and average stream data rates were somewhere around 2 Mbit/s. That's around 900 MB of data for every hour of video streaming. If 30 GB of my monthly data came from video streaming, that's the equivalent of around 33 hours of video. Sounds like a lot, but divided by 30 days that's around 1.1 hours of video per day.
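
For reference, here is the back-of-the-envelope arithmetic as a small Python sketch; the 30 GB streaming share and the decimal units are assumptions, the other inputs are the figures mentioned above:

# Back-of-the-envelope arithmetic for the figures above (decimal units assumed)
stream_rate_mbit_s = 2                  # rough average stream data rate
streaming_volume_gb = 30                # assumed share of monthly traffic that is video

mb_per_hour = stream_rate_mbit_s * 3600 / 8                 # ~900 MB per hour of video
streaming_hours = streaming_volume_gb * 1000 / mb_per_hour  # ~33 hours per month
print(f"{mb_per_hour:.0f} MB/h, {streaming_hours:.1f} h/month, {streaming_hours / 30:.1f} h/day")

# Average 24/7 bit rate for the total monthly volume of 70 GB
total_gb = 70
avg_kbit_s = total_gb * 8e9 / (30 * 24 * 3600) / 1000       # ~216 kbit/s
print(f"average rate: {avg_kbit_s:.0f} kbit/s")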

Now imagine how much data would have been transferred over my line with 2 teenagers at home and at full HD resolution…

GPRS Network Control Order

In this day and age of LTE it perhaps seems a bit outdated to write a technical post on a GPRS topic. But here we go anyway, since I recently looked up the following details on the GPRS network control order in the 3GPP specs:

When GPRS was initially launched, mobile devices performed cell reselections, even during data transfer, on their own and without any information from the network about the parameters of the target cell. Consequently, there was an outage of several seconds during the cell change. Over time, networks adopted a feature referred to as "Network Assisted Cell Change" (NACC), which comes in a couple of flavors depending on the network control (NC) order.

From what I can tell, most GPRS and EDGE networks today use Network Control Order 0 (NC0). That means that the UE performs neighboring cell measurements during the data transfer and reselects to a new cell on its own, i.e. without informing the network. If the network indicates that it supports Cell Change Notification (CCN) (in SIB-13), the UE can ask for support for the cell change by sending a Packet Cell Change Notification message to the network. The network then supplies the system parameters of the new cell to the UE, which can then perform the cell reselection much more quickly. That's the NACC mode that is pretty much common in networks today.

But there is more. If the network indicates in SIB-13 that the Network Control Order equals 1 (NC1), the UE has to send measurement reports to the network. Cell reselections are still performed autonomously by the UE when required, again by using NACC if the CCN feature is indicated as active by the network.

And finally, there's Network Control Order 2 (NC2), in which the UE has to send measurement reports to the network and only reselects to another cell when told to do so by the network with a Packet Cell Change Order.
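
To summarize the three modes, here is a short illustrative Python sketch; it is a condensed restatement of the behavior described above, not an excerpt from the 3GPP specs:

# Condensed summary of the GPRS/EDGE network control orders described above
from enum import Enum

class NetworkControlOrder(Enum):
    NC0 = 0  # UE measures and reselects autonomously, no measurement reports
    NC1 = 1  # UE sends measurement reports but still reselects autonomously
    NC2 = 2  # UE sends measurement reports, the network orders the cell change

def ue_behavior(nc, ccn_active):
    """Rough decision logic for what the UE does during a packet transfer."""
    if nc is NetworkControlOrder.NC2:
        return "send measurement reports, wait for a Packet Cell Change Order"
    prefix = "send measurement reports, " if nc is NetworkControlOrder.NC1 else ""
    if ccn_active:
        return prefix + "send a Packet Cell Change Notification, reselect with NACC"
    return prefix + "reselect autonomously without network assistance"

for nc in NetworkControlOrder:
    print(nc.name, "->", ue_behavior(nc, ccn_active=True))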

I haven't seen NC1 or NC2 in live networks yet but perhaps some of you have. If so, I'd be happy to hear about it.

For the details have a look in 3GPP TS 44.018 and 44.060.

Owning My Data – A Script To Export Images From Typepad

A number of service provider cloud services have vanished recently, in some cases without leaving me the opportunity to retrieve my data beforehand. Take the Mobile Internet Access Wiki that I started many years ago as an example: it was just turned off without any notice. There is an old saying that goes along the lines of being allowed to make a mistake once, but not twice. Following that mantra, I started thinking about which other service-provider-hosted cloud services I use and how to back up my data – just in case.

The most important one is Typepad, which has hosted my blog since 2005. They do a good job and I pay an annual fee for their services. But that does not necessarily mean I will have access to my data should something go wrong. Typepad offers an option to export all blog posts to a text file and I've been making use of this feature from time to time already. There are also WordPress plugins available to import these entries into a self-managed WordPress installation. I haven't tried the latter part so far, but the exported text file is structured simply enough for me to believe that importing it into WordPress can be done. The major catch, however, is that the export does not include pictures. And I have quite a lot of them. So what can be done?

At first I searched the net for a solution, and the suggestions range from asking Typepad for the images to Firefox plugins that download all images from a site. But none of them offered a straightforward way to retrieve the full content of my blog, including images, to create regular backups. So I had a bit of fun lately creating a Python script that scans the Typepad export file for the URLs of images I have uploaded and ignores links to external images. Piped into a text file, that list can then be used with tools such as wget to automatically download all images. As the script could be useful for others out there as well, I've attached it to this post below. Feel free to use and extend it as you like, and please share it back with the community.

Over the years Typepad has changed the way uploaded images are embedded in blog posts, and also the directory structure in which images are saved. I have found four different variants, ranging from simple HTML code to links to further HTML pages and JavaScript that generates a popup window with the image. In some cases the Python script copies the URL straight out of the text file, while in other cases the URL of the popup HTML page is used to construct the filename of the image, which can then be converted into a URL to download the file. Yes, it required a bit of fiddling around to get this working. The result is a number of "if/elif" decisions in the script combined with a number of string compare/copy/insert/delete operations. In the end the script gave me close to 900 URLs of images and their thumbnails that I have uploaded over the years.
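
To give an idea of the basic approach (the attached script is more elaborate), here is a simplified sketch. It only covers the straightforward case of images referenced directly via src or href attributes and assumes they are hosted under the blog's own domain:

#!/usr/bin/env python3
# Simplified sketch of the idea behind the attached script: scan the Typepad
# export file for image URLs belonging to the blog itself and print them one
# per line, so the output can be redirected to a file and fed to 'wget -i'.
# It only handles direct src/href references, not the popup/JavaScript variants.
import re
import sys

IMAGE_LINK = re.compile(r'(?:src|href)="(http[^"]+\.(?:jpe?g|png|gif))"', re.IGNORECASE)

def extract_image_urls(export_file, domain):
    domain = domain.replace("http://", "").replace("https://", "")
    urls = set()
    with open(export_file, encoding="utf-8", errors="ignore") as f:
        for line in f:
            for url in IMAGE_LINK.findall(line):
                if domain in url:       # skip links to external images
                    urls.add(url)
    return sorted(urls)

if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: ./extract_image_urls.py blog.txt http://myblogname.typepad.com")
    for url in extract_image_urls(sys.argv[1], sys.argv[2]):
        print(url)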

And here's the full procedure for backing up your Typepad blog and images on a Linux machine. It should work similarly on a Windows box, but I leave it to someone else to describe how to install Python and get 'wget' working there:

  • Login to Typepad, go to "Settings – Import/Export" and click on "Export"
  • This will start the download of a text file with all blog entries. Save with a filename of your choice, e.g. blog.txt
  • Use the Python script as follows to get the image URLs out of the export file: './get_links.py blog.txt domainname > image-urls.txt'. The domainname parameter is the name under which the blog is available (e.g. http://myblogname.typepad.com). This information is required so the script can distinguish between links to uploaded images and links to external resources which are excluded from the result.
  • Once done, check the URLs in 'image-urls.txt' and make spot checks against some of your blog posts to get a feeling for whether anything might be missing. The script finds all images on my blog, but that doesn't necessarily mean it works equally well on other blogs, as there might be options to upload and embed images that I have never used and that result in different HTML code in the Typepad export file, which the script would miss.
  • Once you are happy with the content of 'image-urls.txt', use wget to retrieve the images: 'wget -i image-urls.txt'.
  • Once retrieved, ensure that all downloaded files are actually image files and again perform some spot checks against blog entries.
  • Save the images together with the exported text file for future use.

Should the day ever come when I need this backup, some further steps will be necessary. Before importing the blog entries into another blog, the existing HTML and JavaScript code for embedded images in the Typepad export file needs to be changed. That's trickier than just replacing URLs, because in some cases the filenames of the image thumbnails are different, and in other cases indirect links and JavaScript code have to be replaced with HTML code that directly embeds thumbnails and full images into the posts. In other words, that's some more Python coding fun.

Download Get_links

The Number of Programming Languages I Have Used In The Past 12 Months

From time to time I need to get things done that require some form of programming because they can't be done with an off-the-shelf program. When counting the number of scripting and programming languages I have used for various purposes over the last 12 months, I was surprised that it was at least 8. Quite an incredible number, and it was only possible because Internet search engines make it easy to quickly find code samples and background information on programming language syntax and APIs on the net. Books might have helped with the syntax, but it would have taken much longer. Also, books would have been of little use for quickly finding solutions to the specific problems I had.

And here's the list of programming languages I have used in the past year and the kinds of projects I used them for:

  • Python for my Typepad image exporter
  • Visual Basic for my WoaS to MoinMoin Wiki converter
  • Open Office Basic to improve a 7 bit to ASCII converter
  • Some bash programming for cron scripts, piping information to text files for later analysis, etc.
  • Zotero scripting to get footnotes into a special format
  • Java on Android for my network monitoring app and for giving an introduction to Android programming in my latest book
  • Assembly language for the deep dive in malicious code analysis
  • C, again for my deep dive in malicious code analysis

Obviously I haven't become an expert in any of these languages, because I only used each one for a specific purpose and for a short time. But while their syntax and APIs are quite different, the basic procedural or object-oriented approaches are pretty much the same. So I am glad that during my time at university I learned the basic principles of programming, which I can now apply quickly to new programming languages.

Getting Selfoss Running On A Raspberry Pi

This is the propeller-hat follow-up to yesterday's post with thoughts on replacing Google Reader with a local Selfoss RSS aggregator instance running on a Raspberry Pi. As it's not a two-minute, straightforward installation, I thought the results of my efforts might be useful to some of you as well. And as a goodie, I have some tips at the end on how to disable HTTP so that only HTTPS is exposed to the outside world, and how to set up HTTP basic authentication to ensure the bad guys don't even get to try anything funny in case the Apache web server configuration is not quite watertight. So here we go:

  • Most importantly, use a fresh Raspbian image without Apache or PHP already installed
  • Create a folder selfoss in your home folder and download the Selfoss archive into it (2.X -> replace with the current version number, check here):

mkdir selfoss
cd selfoss
wget http://selfoss.aditu.de/selfoss-2.X.zip

  • Unzip the file

unzip selfoss-2.3.zip

  • Make sure your .htaccess file has:

  "RewriteEngine On" and
  "RewriteBase /selfoss   (double check there's no "#" at the beginning of the line)

  • Change permissions via

chmod a+w data/cache data/favicons data/logs data/thumbnails data/sqlite public/

  • Change the ownership of the "selfoss" folder and all content to www-data

cd ..
sudo chown -R www-data:www-data selfoss

  • Now install apache2 with php:

sudo apt-get update && sudo apt-get upgrade
sudo apt-get install apache2 php5 sqlite libapache2-mod-php5 php5-sqlite php5-gd

  • Enable rewrite and headers:

sudo a2enmod rewrite
sudo a2enmod headers
sudo a2enmod php5
sudo service apache2 restart

  • Change the rewrite settings in /etc/apache2/sites-available/default and also in /etc/apache2/sites-available/default-ssl by setting "AllowOverride All"

  • Change directory to "/var/www" and create a link to your installation of selfoss via

sudo ln -s /home/pi/selfoss

At this point things should be working so give it a try by going to http://localhost/selfoss

Once you are happy with the setup, here are some additional steps for privacy and security:

  • Password-protect the site with HTTP basic authentication via Apache: put the following Directory configuration into /etc/apache2/apache2.conf, somewhere at the end:

<Directory />
# Ask for a username and password for everything Apache serves
AuthType Basic
AuthName "Wiki"
AuthBasicProvider file
# Usernames and password hashes are stored in this file (created below with htpasswd)
AuthUserFile /etc/apache2/pwd.txt
Require user PUT_IN_YOUR_USERNAME
</Directory>

  • For some reason this does not protect the root directory. Therefore create a .htaccess file in the document root (in /var/www) and put the same directives inside, minus the <Directory> lines.
  • Give .htaccess the right owner:

sudo chown www-data:www-data /var/www/.htaccess

  • Create the password file referenced above:

sudo htpasswd -c /etc/apache2/pwd.txt PUT_IN_YOUR_USERNAME
  New password: mypassword
  Re-type new password: mypassword
  Adding password for user xxxxx

  • And finally, disable plain HTTP access by commenting out 'Listen 80' in /etc/apache2/ports.conf, i.e. by putting a # in front of the line.
  • Finish with a sudo service apache2 restart

Auto Update The Feeds

  • To always have the content in Selfoss updated, put the following line into /etc/crontab to update once an hour, at 41 minutes past the hour:

41 *    * * *   root    wget --no-check-certificate --user=xxx --password="xxxx" https://localhost/selfoss/Update

  • Note the upper-case 'U' in Update; it won't work with a lower-case 'u'.
  • Also note that the --no-check-certificate flag is necessary because we use a self-generated HTTPS certificate that can't be validated (a small Python sketch of the same request follows below).
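
This is a minimal sketch only; it performs the same authenticated HTTPS request as the wget call, skipping verification of the self-signed certificate. The URL and credentials are the placeholders from the crontab line above:

#!/usr/bin/env python3
# Minimal sketch: trigger the Selfoss feed update over HTTPS with basic
# authentication, skipping verification of the self-signed certificate.
# Does the same as the wget one-liner in the crontab entry above.
import base64
import ssl
import urllib.request

URL = "https://localhost/selfoss/Update"   # note the upper-case 'U'
USER, PASSWORD = "xxx", "xxxx"             # the Apache basic-auth credentials

context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE        # accept the self-signed certificate

request = urllib.request.Request(URL)
credentials = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
request.add_header("Authorization", "Basic " + credentials)

with urllib.request.urlopen(request, context=context) as response:
    print(response.status)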