Monthly Archives: December 2014

Tales from the factory – curl vs wget


 

From time to time I will share some stories based on true events. Maybe someone will learn something from them. Then again, maybe not. To protect the innocent, some names and events might be edited. Here comes the first one.

 

Someone raised a ticket saying that their application cannot access a certain URL, let’s say “http://My.url.tld”. You dutifully log in to the system in question and try to access the URL. Since the app uses the “libcurl” library, you naturally test with the matching utility, “curl”. You confirm that it does not work:

[user@someserver ~]$ curl http://My.url.tld
Error message.
[user@someserver ~]$

At the same time a colleague also sees the ticket, but for some reason he tests with “wget”. It works for him:

[user@someserver ~]$ wget http://My.url.tld
Correct result.
[user@someserver ~]$

You go back and forth with “it’s working” and “no, it’s not” messages until both of you realize that you are testing differently. So it works with “wget” but not with “curl”. Baffling. What could be wrong?

After running both utilities in debug mode you spot a minute difference:

[user@someserver ~]$ curl -v http://My.url.tld
* About to connect() to My.url.tld port 80 (#0)
* Trying 1.1.1.1... connected
* Connected to My.url.tld (1.1.1.1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl
> Host: My.url.tld:80
> Accept: */*
>
<
< Error message
<
<
* Connection #0 to host My.url.tld left intact
* Closing connection #0
[user@someserver ~]$

[user@someserver ~]$ wget -d http://My.url.tld
DEBUG output created by Wget 1.12 on linux-gnu.
Resolving My.url.tld... 1.1.1.1
Caching My.url.tld => 1.1.1.1
Connecting to My.url.tld|1.1.1.1|:80... connected.
Created socket 3.
Releasing 0x000000000074fb60 (new refcount 1).

---request begin---
GET / HTTP/1.0
User-Agent: Wget (linux-gnu)
Accept: */*
Host: my.url.tld:80
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 200 OK

Correct answer

---response end---
200 OK
Registered socket 3 for persistent reuse.
Length: 242 [text/xml]
Saving to: “filename”

100%[=========================================================================================================================================================================>] 242 --.-K/s in 0s

“filename” saved [242/242]
[user@someserver ~]$

Have you spotted it?

.
.
.
.
.
(suspense drumroll)
.
.
.
.
.
.
.

It turns out that wget does the equivalent of a tolower() on the hostname, so in the actual HTTP request it sends “Host: my.url.tld”, while curl just takes what I specified on the command line, namely “Host: My.url.tld”. Taking the test further, it turns out that calling curl with the all-lowercase URL produces the expected result (i.e. it works).
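To make the difference concrete, here is a minimal Python sketch (not what either tool actually runs internally) that shows the two host values a client could derive from the same URL. The helper name `host_header_variants` is made up for illustration. It relies on the standard library's `urlsplit`, whose `netloc` keeps the case as typed while `hostname` is returned in lower case.

```python
from urllib.parse import urlsplit


def host_header_variants(url):
    """Return (as_typed, lowercased) host values for a URL.

    Hypothetical helper, for illustration only.
    """
    parts = urlsplit(url)
    as_typed = parts.netloc          # keeps the case the user typed (curl-like)
    lowered = parts.hostname or ""   # urllib lowercases this attribute (wget-like)
    return as_typed, lowered


if __name__ == "__main__":
    as_typed, lowered = host_header_variants("http://My.url.tld")
    print("as typed:  Host:", as_typed)
    print("lowercase: Host:", lowered)
```

Both values name the same host. They only differ as strings, which is exactly what a naive string comparison further down the path will trip over.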

I know what you are thinking: it should not matter how you spell a hostname. True. Except that in this story there is a load balancer in the way, which tries (and mostly succeeds) to do smart stuff. Well, it turns out that there was a host-based string match in that load balancer that did not quite match the mixed-case variants.
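Host names are case-insensitive, so a host-based rule should compare them that way. The sketch below is only an assumption about what a more forgiving match could look like. It is not the configuration of the load balancer in this story, and `host_matches` is a hypothetical name.

```python
def host_matches(host_header, expected):
    """Compare a Host header value with a configured name, ignoring case.

    Hypothetical helper. The port handling is deliberately simple and
    does not cover IPv6 literals such as "[::1]:80".
    """
    name = host_header.partition(":")[0]  # drop an optional ":port" suffix
    return name.lower() == expected.lower()


if __name__ == "__main__":
    # A plain string comparison fails on the mixed-case request...
    print("My.url.tld" == "my.url.tld")
    # ...while the case-insensitive match accepts both spellings.
    print(host_matches("My.url.tld:80", "my.url.tld"))
    print(host_matches("my.url.tld:80", "my.url.tld"))
```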

But a question remains. What is the correct behavior? The “curl” one or the “wget” one? I lean toward the “curl” approach, but maybe I am biased. What do you think?

Winds of change


 

Three months have passed since I made a (rather abrupt) shift in the focus of my career, specifically from net to sys. I’ve learned a lot since then and have encountered a very different set of challenges than in the previous 15 years. Fun.

 

Unfortunately that also meant that I did not have much time to look after this page. I finally managed to put things in order in my mind and chill a bit. So from now on I will talk less about networking stuff and more about systems stuff.

To signal this, I decided to reflect the change in the name and subtitle of this site as well.

So, goodbye packets, say hello to processes.

Later edit:
After two and a half years I’m back to net. ‘Nuff said.