
Category: Tips

SSH power unleashed – Part 2: use SSH agent as an authentication provider

In the first part of this series I introduced SSH keys and the SSH agent and their applications. In this one I’m going to take it further: let’s see how SSH agent authentication can make things easier in other situations while still keeping them secure.

A pain point of sudo

Sudo is so commonly used that it probably doesn’t need any introduction here. I find myself in a dilemma when trying to stay secure AND convenient at the same time. By default, when you set up sudo to grant a user root power, the user has to authenticate by entering their password when running sudo. This is usually set up by putting a line like the one below in the /etc/sudoers file.

username        ALL=(ALL)       ALL

People like me tend to set very long passwords, so every time I run sudo I have to type mine in, which is a real finger dance. Yes, I mentioned this in the last post already, and I solved it there with the SSH agent. Hint: it is going to help us again this time.

Some people try to solve this problem by setting up sudo like below instead.

username        ALL=(ALL)       NOPASSWD: ALL

The NOPASSWD part makes sudo skip the password prompt when username runs sudo, and grants root power silently. As you may have figured out already, although it is convenient, it is a huge risk: if the user’s account gets compromised, the attacker gets root power with no extra effort.

Let’s take SSH agent authentication beyond just SSH connections

So you may wonder, as I did: we already have a way to authenticate ourselves to SSH via the SSH agent and skip password input in a secure way. Can we use that same facility for sudo? The answer is yes!

First we need to install the great pam_ssh_agent_auth package. It provides the capability to authenticate via the SSH agent in the PAM framework (I will cover the details later in this post). The package is included in the repositories of many Linux distros, so just install it with your favorite package manager. Here is what I did on a CentOS server:

yum install pam_ssh_agent_auth

Then put a line in the file /etc/pam.d/sudo

auth       sufficient   pam_ssh_agent_auth.so file=~/.ssh/authorized_keys          <--- This is the line to put in
auth       include      system-auth
account    include      system-auth
password   include      system-auth
session    optional     pam_keyinit.so revoke
session    required     pam_limits.so

Make sure the line is located above the other “auth” lines, as in the example above. This line tells PAM that when sudo tries to authenticate a user, it should first try the pam_ssh_agent_auth module. If that succeeds, the user is authenticated and gets sudo power. If it fails, PAM tries the next authentication method, the global system-auth, which in most cases asks for a password.

The file=~/.ssh/authorized_keys parameter in that line tells the pam_ssh_agent_auth module to verify the user’s SSH agent against the public keys stored in the user’s own home directory. You can also change that to some other file path, say, if you would like sudoers to authenticate against a dedicated, admin-managed SSH key for sudo.

Once that is done, follow the instructions in the last post regarding the SSH agent and agent forwarding. Then you can sudo on a remote server without typing the password, and if your account gets compromised, the attacker won’t be able to sudo because they don’t have your SSH key.
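The ordering requirement is easy to get wrong, so here is a minimal sketch of a sanity check. It is not part of pam_ssh_agent_auth: the function name agent_auth_is_first is made up for this post, and it only looks at lines whose first field is exactly auth.

```shell
# Hypothetical helper: succeed only if the first "auth" line in a PAM
# service file loads pam_ssh_agent_auth.so.
agent_auth_is_first() {
    awk '
        $1 == "auth" {
            # the first auth line decides the outcome
            found = ($0 ~ /pam_ssh_agent_auth\.so/) ? 1 : 0
            exit
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

It could be used as agent_auth_is_first /etc/pam.d/sudo && echo "agent auth is tried first". Also note that the module finds the agent through the SSH_AUTH_SOCK environment variable, so depending on your sudo configuration you may need to keep that variable in sudo’s environment (for example with an env_keep entry in sudoers).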

The magic behind

So what happens behind all this? The major parts in this magic are: SSH agent, PAM, and the pam_ssh_agent_auth module.

I have already talked about SSH agent so I won’t repeat it. The pam_ssh_agent_auth module connects PAM and SSH agent. The key here is PAM.

What is PAM? PAM means Pluggable Authentication Modules. Here is what PAM says about itself in its man page.

Linux-PAM is a system of libraries that handle the authentication tasks of applications (services) on the system. The library provides a stable general interface (Application Programming Interface - API) that privilege granting programs (such as login(1) and su(1)) defer to to perform standard authentication tasks.

The principal feature of the PAM approach is that the nature of the authentication is dynamically configurable. In other words, the system administrator is free to choose how individual service-providing applications will authenticate users. This dynamic configuration is set by the contents of the single Linux-PAM configuration file /etc/pam.conf. Alternatively, the configuration can be set by individual configuration files located in the /etc/pam.d/ directory. The presence of this directory will cause Linux-PAM to ignore /etc/pam.conf.

My own summary of PAM:

  • PAM gives the system a plugin capability that is easy to extend for both developers and system admins. A plugin is called a module.
  • PAM provides a universal way for applications to use different authentication methods provided by different modules. The application doesn’t need to change if the system starts with password authentication and later adds fingerprint authentication.
  • Each authentication method provided by PAM can be enabled, disabled, or configured individually without interfering with the others.
  • Different modules can be linked together to provide parallel or serial paths in the authentication flow.

PAM has been a fundamental part of Linux systems for many years, and many modules have been developed for this framework. The pam_ssh_agent_auth module I mentioned here is one of them.

The sudo command, like many other applications on Linux, makes use of PAM for user authentication. The file /etc/pam.d/sudo controls how sudo uses PAM. Before we put the line in, it defaults to system-auth, which on most systems is password authentication (check /etc/pam.d/system-auth if you are interested in what it does on your system). After we put the line in, sudo first tries what that specific line instructs.

What the configuration line I put in /etc/pam.d/sudo does is explained below:

auth - Tells PAM this line is about how to authenticate a user.
sufficient - Tells PAM that the user is considered successfully authenticated if this module passes; there is no need to try the other "auth" lines after this one.
pam_ssh_agent_auth.so - Tells PAM to load and execute this module.
file=~/.ssh/authorized_keys - The parameter to the module; its meaning is already explained above.

So when a user fires up sudo, it calls PAM for authentication. PAM then looks into the file /etc/pam.d/sudo and decides to try the pam_ssh_agent_auth module first. This module interacts with the SSH agent and verifies the private key information provided by the agent against the public keys in ~/.ssh/authorized_keys. If the verification succeeds, the module returns to PAM and reports that the authentication was successful. As PAM sees the sufficient control here, it considers the whole authentication process successful and sudo can move on. If the user has no SSH agent running, or the agent is not able to provide the correct key, the authentication attempt of this module fails and PAM moves on to the next line, which still allows the user to authenticate with a password.
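The flow described above can be illustrated with a toy model. This is only a sketch of the decision logic, not how PAM is implemented: the function name and the control:result notation are invented here, only the sufficient and required controls are modeled, and the include system-auth line is simplified to a single required password module.

```shell
# Toy model of how PAM walks an "auth" stack. Each argument is
# "control:result", e.g. "sufficient:fail" or "required:ok".
pam_stack_result() {
    failed=0    # set once a required module has failed
    passed=0    # set once a required module has succeeded
    for entry in "$@"; do
        case "$entry" in
            sufficient:ok)
                # success ends the walk unless an earlier required module failed
                if [ "$failed" -eq 0 ]; then echo granted; return 0; fi ;;
            sufficient:fail)
                ;;                  # ignored, PAM moves on to the next line
            required:ok)
                passed=1 ;;
            required:fail)
                failed=1 ;;         # remembered, later modules still run
        esac
    done
    if [ "$failed" -eq 0 ] && [ "$passed" -eq 1 ]; then
        echo granted
    else
        echo denied
        return 1
    fi
}
```

For example, pam_stack_result sufficient:ok required:fail models a user whose agent holds the right key (the password line is never reached), while pam_stack_result sufficient:fail required:ok models the fallback to a correct password.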

More than that

This approach is not specific to sudo. It can be enabled for any program that makes use of PAM for user authentication. So just open your mind and find out what makes you keep typing passwords. If it uses PAM, congratulations, you may save your fingers!


Discourse SSO login/logout state synchronization tips

Discourse provides SSO integration functionality, which allows accounts from an external service to log in. The official Discourse SSO document already describes in detail how to set up Discourse to integrate with an external SSO provider. Read that one very carefully.

Here I’m going to talk about how to keep the login/logout state synchronized between Discourse and the rest of the website (the SSO provider). This is quite important if you would like your visitors to have a good and consistent experience on the site. Without proper setup, the experience is bad in several ways, as described in each section below.

The term “website” in this post refers to the part of the site that is not backed by Discourse. It is the SSO provider for Discourse. If you are using WP or some other site builder, the implementation is similar.

Synchronize state from website to Discourse


When a user is logged in on the website, the URL that links to the forum site should be the forum URL with /session/sso appended.

If your site allows anonymous browsing, make sure you detect the user login state and only append the /session/sso part for logged-in users. For anonymous users, just link to the plain forum URL.

Without this setup, when a user navigates to Discourse, he will need to click the Discourse login button to log in.
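As a minimal sketch of that rule, the link choice boils down to one condition. The function name forum_link, the "yes" flag, and the example host are placeholders made up for illustration; your website would do the equivalent in its own template language.

```shell
# Hypothetical helper: print the forum link for the current visitor.
# $1 = forum base URL, $2 = "yes" if the visitor is logged in on the website.
forum_link() {
    base=${1%/}                       # drop a trailing slash, if any
    if [ "$2" = "yes" ]; then
        printf '%s/session/sso\n' "$base"
    else
        printf '%s\n' "$base"
    fi
}
```

With this, forum_link https://forum.example.com yes prints the /session/sso link, and any other second argument yields the plain forum URL.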


When a user logs out on the website, the website needs to send a Discourse API request to the log_out endpoint for that user, a URL ending in {user_id}/log_out

The user_id here is the user ID in Discourse; it may be different from the user ID the website uses.

How to get the user_id of Discourse

Quoting the official SSO document:

User profile data can be accessed using the /users/by-external/{EXTERNAL_ID}.json endpoint. This will return a JSON payload that contains the user information, including the user_id which can be used with the log_out endpoint.

So the website needs to send its own user ID to the endpoint above, get back the Discourse user ID, and then log the user out.

Without this setup, when a user logs out from the website, he is still logged in on Discourse. If he navigates back and forth between the site and Discourse, he will see mismatched login states, which is very confusing.
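The lookup step can be sketched as a small URL builder. Only the /users/by-external/{EXTERNAL_ID}.json path comes from the quoted document; the function name, the example host, and the header names in the commented curl call are assumptions to verify against your Discourse version.

```shell
# Hypothetical helper: build the lookup URL for a website user ID,
# using the /users/by-external/{EXTERNAL_ID}.json path quoted above.
# $1 = forum base URL, $2 = the user ID used by the website.
discourse_lookup_url() {
    printf '%s/users/by-external/%s.json\n' "${1%/}" "$2"
}

# Possible use (not executed here; key, username and headers are assumptions):
#   curl -s -H "Api-Key: $DISCOURSE_API_KEY" -H "Api-Username: system" \
#        "$(discourse_lookup_url https://forum.example.com 42)"
```

The website would read the Discourse user ID from the returned JSON and then call the log_out endpoint with it.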

Synchronize state from Discourse to website


No special handling is required. When SSO is turned on in Discourse and a user clicks the login button in Discourse, he is redirected to the website login page. Once logged in, he is logged in on both the website and Discourse.


The website needs to implement a logout URL.

In Discourse site settings -> users -> logout redirect, fill in that URL. Then when a user logs out from Discourse, he is redirected to the website and logged out there as well.

Without this, when a user logs out from Discourse, he is still logged in on the website.

After following the tips above, your users will have a consistent login/logout state across the website.

I’ve also posted this on


SSH power unleashed – Part 1: use private keys

I’m going to talk about some advanced SSH tips in this series. Many people use SSH on a daily basis, yet only use its very basic functions: log into a server, do some work, log out. But SSH is actually very feature rich and flexible. It can even do things many people have never heard of. Let’s start with the well-known public and private key authentication.

Tip 1: never log in with a password. Use public key authentication instead.

The first rule of running a production server is: disable remote root login and password login for normal users. Make sure “PermitRootLogin no” and “PasswordAuthentication no” are in your sshd configuration file, usually /etc/ssh/sshd_config.

You may also want to lock the root password. This adds extra protection to your root power. It can be done with the command below:
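To double-check a configuration file for these two settings, something like the following sketch could help. The function name sshd_option_is is made up for this post; it accepts both the “Keyword value” and “Keyword=value” spellings, takes the first occurrence of a keyword, and ignores Match blocks and included files entirely.

```shell
# Hypothetical helper: succeed if the first occurrence of KEYWORD in a
# given sshd_config-style file has VALUE (both compared case-insensitively).
# Usage: sshd_option_is FILE KEYWORD VALUE
sshd_option_is() {
    awk -F'[ \t=]+' -v key="$2" -v want="$3" '
        { sub(/^[ \t]+/, "") }          # trim leading whitespace
        /^#/ { next }                   # skip comment lines
        tolower($1) == tolower(key) {
            ok = (tolower($2) == tolower(want)); exit
        }
        END { exit(ok ? 0 : 1) }        # a missing keyword counts as "not set"
    ' "$1"
}
```

For example, sshd_option_is /etc/ssh/sshd_config PasswordAuthentication no || echo "password login is still allowed". Since a missing keyword means the sshd default applies, running sshd -T as root to print the effective configuration is a more thorough check.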

passwd -l root

Once the root password is locked, no one can log in as root with a password. The only way to become root is to either log in with some other authentication method, such as an SSH public key, or use su/sudo to elevate from a normal user.

Now it is time to get a pair of public and private keys for yourself to log in with. On a Linux machine, this can be done with the command below.

Note: this should be done on your local machine, not your server. NEVER put your private key on a remote server!

ssh-keygen -t rsa -b 2048

Then follow the on-screen prompts to provide the file name for saving the keys (the default is OK) and the passphrase.

If you want stronger keys, just raise the number of bits given to the -b option. 2048 is sufficient for now; 4096 may keep you confident for longer.

Now look at your ~/.ssh/ directory: your new keys are there. The public key file is named id_rsa.pub, and the private key file is named id_rsa. The public key file can be disclosed to the world, while the private key file should be kept safe, as safe as the key to your home, maybe even safer 😉 .

Now upload the public key file to your remote server and make it usable for SSH. It can be done as below.

cat id_rsa.pub >> ~/.ssh/authorized_keys

This appends the content of your newly generated public key to the authorized_keys file, which is checked by SSH when you log in.
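Running that cat command twice adds the key twice, and sshd may refuse an authorized_keys file with loose permissions. Here is a minimal sketch of a safer variant; the function name install_pubkey is made up for this post, and it assumes the public key file holds a single line.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only
# if that exact line is not there yet, then restrict the file to its owner.
# $1 = public key file, $2 = authorized_keys file (defaults to the usual path)
install_pubkey() {
    pub=$1
    auth=${2:-$HOME/.ssh/authorized_keys}
    mkdir -p "$(dirname "$auth")"     # ~/.ssh itself should be mode 700
    touch "$auth"
    # -F fixed strings, -x whole-line match, -q quiet, -f patterns from file
    grep -Fxq -f "$pub" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}
```

On the server this would be called as install_pubkey id_rsa.pub. The ssh-copy-id tool shipped with OpenSSH does a similar job from the client side, if it is available on your machine.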

Now log in from your local machine to the server with key authentication. You need to provide the passphrase of the private key instead of your system account password. And you can now safely disable password login in the sshd configuration.

Tip 2: use ssh-agent to cache your keys

Imagine that you manage a few servers and need to log in frequently to perform some tasks. It’s going to be a pain to enter the password of the key every time. It is quite tedious for me: my password is long and made of nonsense characters, so every time I enter it, it feels like a finger dance!

The ssh-agent command is made for this purpose. It can be used to invoke a shell and cache your private key in memory. The next time an SSH key is requested, it provides the data directly without requiring you to enter the password again. Let’s see how to do that.

First start a shell under ssh-agent:

ssh-agent /bin/bash

It seems nothing happened and you are just dropped back to the shell prompt. But you are actually in a newly invoked shell now. In this shell, ssh-agent keeps your unlocked private key for you.

Now load your private key(s).

ssh-add

It will show you the file name of the private key it is going to load and ask you for the password. Once you are done, the unlocked private key is cached in memory. Now try to log into your remote server which has the public key installed. You will notice that no password is required during SSH login!

Tip 3: start ssh-agent upon login

The tip above makes life easier. Let’s move on and see how we can make it even easier.

Instead of starting ssh-agent every time after you log in to your local machine, it is possible to start it automatically. If you are using Bash as your login shell, put the lines below at the end of your ~/.bashrc file.

eval $(/usr/bin/ssh-agent -s)
ssh-add

And the lines below at the end of your ~/.bash_logout file.

if [ -n "$SSH_AGENT_PID" ] ; then
    /usr/bin/ssh-agent -k
fi

During your login, you will notice the message “Agent pid xxxxx”, which means the newly added code in ~/.bashrc just ran, started ssh-agent, and set up the environment for you. Then the ssh-add command asks you to load your private keys. Type in your password and your ssh-agent is up and running just as described in Tip 2. You can SSH to remote servers without typing a password.

When you log out, the code in ~/.bash_logout makes sure the agent is killed so no key data is left in memory.
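One refinement worth considering: ~/.bashrc also runs for nested interactive shells, so the snippet above can start more agents than needed. A minimal sketch of a guard follows; the function name need_agent is made up for this post.

```shell
# Hypothetical helper: succeed if no usable agent socket is advertised in
# the environment, i.e. a new ssh-agent should be started.
need_agent() {
    [ -z "${SSH_AUTH_SOCK:-}" ] || [ ! -S "$SSH_AUTH_SOCK" ]
}

# Possible use in ~/.bashrc (commented out so the sketch has no side effects):
#   if need_agent; then
#       eval "$(/usr/bin/ssh-agent -s)"
#       ssh-add
#   fi
```

With this guard, a shell that already sees an agent (for example a forwarded one, or one started by a parent shell) reuses it instead of spawning another.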

Tip 4: use SSH agent forwarding

Remember that I mentioned above that the private key should be kept safe and never uploaded to any remote server? What if you need to log into another server (let’s call it server B) from your remote server (let’s call it server A)? You don’t have the private key on server A. To log into server B, you have to either use password authentication or connect from your local machine, which has the private key. But what if you do need to connect from server A to B and don’t want to use the less secure password authentication?

This is where SSH agent forwarding comes to the rescue. With this technology, authentication requests made on server A are forwarded back to the agent on your local machine, so you can log into server B via server A without ever copying the key file to server A. Let’s see how we can do this.

When you start the connection to server A, first make sure you followed Tip 2 & 3 and are in the ssh-agent shell with the private key loaded, then use the -A option like below:

ssh -A -p <remote_ssh_port> username@server_A

The -A option tells the SSH program to forward the local ssh-agent to server A. When you use SSH there, it can access the private key via the secure SSH connection between your local machine and server A.

Now if you need to connect from server A to server B, just run SSH and you will notice that no password is required: you log into server B with the private key that stays on your local machine!

If you are using PuTTY on Windows as your SSH client, the agent forwarding option is under Connection -> SSH -> Auth -> Authentication Parameters -> Allow agent forwarding.

Caution: always make sure server A is trusted before you forward your agent to it.


Handy cURL shell script for http troubleshooting

The great cURL tool

Many people know about the famous cURL tool. For those who don’t know it yet, here is the introduction from its own man page.

curl is a tool to transfer data from or to a server, using one of the supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP). The command is designed to work without user interaction.

curl offers a busload of useful tricks like proxy support, user authentication, FTP upload, HTTP post, SSL connections, cookies, file transfer resume, Metalink, and more. As you will see below, the number of features will make your head spin!

Many developers, system admins, tech support people and users use it on a day-to-day basis. The typical way of using it is to view HTTP connection details such as request and response headers, which is very handy.

Wait, are you really using it in a great way?

But many people never notice a powerful option cURL provides, the “-w” option. Here is some key information about this option from its man page.

-w, --write-out Make curl display information on stdout after a completed transfer. The format is a string that may contain plain text mixed with any number of variables. The format can be specified as a literal "string", or you can have curl read the format from a file with "@filename" and to tell curl to read the format from stdin you write "@-".

Some really useful variables the “-w” option supports:

size_download The total amount of bytes that were downloaded.
size_request The total amount of bytes that were sent in the HTTP request.
size_upload The total amount of bytes that were uploaded.
speed_download The average download speed that curl measured for the complete download. Bytes per second.
speed_upload The average upload speed that curl measured for the complete upload. Bytes per second.
time_appconnect The time, in seconds, it took from the start until the SSL/SSH/etc connect/handshake to the remote host was completed. (Added in 7.19.0)
time_connect The time, in seconds, it took from the start until the TCP connect to the remote host (or proxy) was completed.
time_namelookup The time, in seconds, it took from the start until the name resolving was completed.
time_pretransfer The time, in seconds, it took from the start until the file transfer was just about to begin. This includes all pre-transfer commands and negotiations that are specific to the particular protocol(s) involved.
time_redirect The time, in seconds, it took for all redirection steps include name lookup, connect, pretransfer and transfer before the final transaction was started. time_redirect shows the complete execution time for multiple redirections. (Added in 7.12.3)
time_starttransfer The time, in seconds, it took from the start until the first byte was just about to be transferred. This includes time_pretransfer and also the time the server needed to calculate the result.
time_total The total time, in seconds, that the full operation lasted.

Try this!

Let’s put these together in a shell script:


#!/bin/bash
set -e

# Output template for curl -w; each %{...} is replaced after the transfer.
curl_format='
              Downloaded (byte)  :  %{size_download}
            Request sent (byte)  :  %{size_request}
                Uploaded (byte)  :  %{size_upload}

       Download speed (bytes/s)  :  %{speed_download}
         Upload speed (bytes/s)  :  %{speed_upload}

            DNS lookup time (s)  :  %{time_namelookup}
  Connection establish time (s)  :  %{time_connect}
           SSL connect time (s)  :  %{time_appconnect}
          Pre-transfer time (s)  :  %{time_pretransfer}
              Redirect time (s)  :  %{time_redirect}
        Start-transfer time (s)  :  %{time_starttransfer}

                 Total time (s)  :  %{time_total}

'

# Discard the body, stay silent, and pass all script arguments on to curl.
exec curl -w "$curl_format" -o /dev/null -s "$@"

Then we can call this script with a URL and get output like this:


              Downloaded (byte)  :  251340
            Request sent (byte)  :  121
                Uploaded (byte)  :  0

       Download speed (bytes/s)  :  2483400.000
         Upload speed (bytes/s)  :  0.000

            DNS lookup time (s)  :  0.000111
  Connection establish time (s)  :  0.000559
           SSL connect time (s)  :  0.000000
          Pre-transfer time (s)  :  0.000623
              Redirect time (s)  :  0.000000
        Start-transfer time (s)  :  0.023913

                 Total time (s)  :  0.101208

We have a pretty view of how much data is transferred and how much time is spent. Isn’t it nice?

Next time you need to diagnose an HTTP issue, besides the regular curl command you usually run, don’t forget to give this one a try. I use it a lot, and I hope you will find it helpful as well.
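One thing to keep in mind when reading the output: the time_* values are all measured from the start of the request, so the duration of each phase is the difference between neighbouring values. A minimal sketch of that arithmetic follows; the function name and output labels are made up for this post, and it skips time_appconnect and time_redirect, which are both zero in the plain HTTP sample above.

```shell
# Hypothetical helper: turn curl's cumulative timings into per-phase durations.
# Arguments, all in seconds:
#   time_namelookup time_connect time_pretransfer time_starttransfer time_total
curl_phases() {
    awk -v dns="$1" -v conn="$2" -v pre="$3" -v start="$4" -v total="$5" 'BEGIN {
        printf "dns=%.6f\n",      dns
        printf "connect=%.6f\n",  conn - dns       # TCP connect only
        printf "wait=%.6f\n",     start - pre      # ready to send until first byte
        printf "download=%.6f\n", total - start    # first byte until the end
    }'
}
```

Feeding it the sample numbers, curl_phases 0.000111 0.000559 0.000623 0.023913 0.101208, shows that most of the 0.1 seconds went into the download itself (0.077295 s) and into waiting for the first byte (0.023290 s), not into DNS or connection setup.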


What is d_type and why Docker overlayfs needs it

In my previous post I mentioned a strange problem that occurs with Discourse running in Docker. Today I’m going to explain it further, as this problem could potentially impact any Docker setup that uses the overlayfs storage driver. In practice, CentOS 7 installed with all defaults is 100% affected. Docker on Ubuntu uses AUFS, so it is not affected.

What is d_type

d_type is the term used in the Linux kernel for “directory entry type”. A directory entry is a data structure that the Linux kernel uses to describe the information of a directory on the filesystem. d_type is a field in that data structure which represents the type of the “file” the directory entry points to. It could be a directory, a regular file, or some special file such as a pipe, a char device, a socket file etc.

d_type information was added in Linux kernel version 2.6.4. Since then, Linux filesystems have started to implement it over time. However, some filesystems still don’t implement it, and some implement it in an optional way, i.e. it can be enabled or disabled depending on how the user creates the filesystem.

Why it is important to Docker

Overlay and overlay2 are two storage drivers supported by Docker. Both of them depend on the overlayfs filesystem. Below is a picture from the Docker documentation showing how Docker uses overlayfs for its image storage.

Overlay storage driver in Docker

In the overlayfs code (it’s part of the Linux kernel), this d_type information is accessed and used to make sure some file operations are handled correctly. There is code in overlayfs that specifically checks for the d_type feature and prints a warning message if it does not exist on the underlying filesystem.

Docker, when running on the overlay/overlay2 storage driver, requires the d_type feature to function correctly. A check was added in Docker 1.13. By running the docker info command you can now tell whether your backing filesystem supports it. The plan is to issue an error message in Docker 1.16 if d_type is not enabled.

When d_type is not supported on the backing filesystem of overlayfs, containers running on Docker run into strange errors during file operations. The chown error during Discourse bootstrap or rebuild is one common example. You can find others in the Docker issues on GitHub; I’ve picked a few below.

Randomly cannot start Containers with “Clean up Error! Cannot destroy container” “mkdir …-init/merged/dev/shm: invalid argument” #22937

Centos 7 fails to remove files or directories which are created on build of the image using overlay or overlay2. #27358

docker run fails with “invalid argument” when using overlay driver on top of xfs #10294

Check whether your system is affected

TL;DR: Ext4? Good. XFS on RHEL/CentOS 7? High chance bad, use xfs_info to confirm

As mentioned above, d_type support is optional for some filesystems. This includes XFS, the default filesystem in Red Hat Enterprise Linux 7, which is the upstream base of CentOS 7. Unfortunately, the Red Hat/CentOS installer and the mkfs.xfs command both create XFS filesystems without the d_type feature turned on by default... What a mess!

As a quick rule, if you are using RHEL 7 or CentOS 7 and your filesystem was created with defaults, without specifying any parameter, you can be almost 100% sure that d_type is not turned on for your filesystem. To check for sure, follow the steps below.

First you need to find out what filesystem you are currently using. Although XFS is the default during installation, some people or hosting providers may choose Ext4. If that’s the case, then relax, d_type is supported.

If you are on XFS, you need to run the xfs_info command against the filesystem you want to check. Below is an example from my system.

$ xfs_info /
meta-data=/dev/sda1              isize=256    agcount=4, agsize=3276736 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=13106944, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=6399, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Pay attention to the last column of the 6th line of the command output. You can see ftype=1. That’s good news. It means my XFS was created with the correct parameter, ftype=1, and thus d_type is turned on. If you see ftype=0 there, d_type is off.
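If you have several machines to check, pulling that value out of the output can be scripted. This is a minimal sketch; the function name xfs_ftype is made up for this post, and it simply looks for ftype=0 or ftype=1 anywhere in the text it reads.

```shell
# Hypothetical helper: read xfs_info output on stdin and print the ftype
# value (0 or 1). Prints nothing if no ftype field is found.
xfs_ftype() {
    sed -n 's/.*ftype=\([01]\).*/\1/p'
}
```

It would be used as xfs_info / | xfs_ftype, where an output of 1 means d_type is turned on for that filesystem.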

How to solve the problem

More bad news: this problem can only be fixed by recreating the filesystem. The setting cannot be changed on an existing filesystem! Basically the steps are:

  1. Back up your data.
  2. Recreate the filesystem with the correct parameter for XFS, or just create an Ext4 filesystem instead.
  3. Restore your data.

Let’s focus on step #2. DON’T try any of the commands below on your server before you fully understand them and have a backup secured!

If you choose the Ext4 filesystem, it’s easy: just run mkfs.ext4 /path/to/your/device and that’s it.

If you choose the XFS filesystem, the correct command is:

mkfs.xfs -n ftype=1 /path/to/your/device

The -n ftype=1 parameter tells the mkfs.xfs program to create an XFS filesystem with the d_type feature turned on.

Take action

It is a good idea to check your system ASAP to see whether this d_type problem affects your RHEL/CentOS 7 installation. The sooner you fix the problem the better.


Fix strange chown error during Discourse bootstrap or rebuild


Discourse is forum software that runs in Docker. When using the overlay/overlay2 storage driver, Docker requires the backing filesystem to support d_type. Otherwise strange errors pop up during very basic file operations, such as the chown command.

The symptom

When bootstrapping or rebuilding Discourse on CentOS, the process fails with chown command related errors.

# ./launcher bootstrap app
Pups::ExecError: cd /var/www/discourse && chown -R discourse /var/www/discourse failed with return #
Location of failure: /pups/lib/pups/exec_command.rb:108:in `spawn'
exec failed with the params {"cd"=>"$home", "hook"=>"web", "cmd"=>["gem update bundler", "chown -R discourse $home"]}

How to confirm whether you are hit by this problem.

If you are running CentOS 7 and the filesystem was created with all defaults, you are hit. The CentOS installer and the mkfs.xfs command both create XFS with ftype=0 by default, which does not meet Docker’s requirement for filesystem d_type support.

Check the xfs_info command output and look for ftype=0.

# xfs_info /
naming =version 2 bsize=4096 ascii-ci=0 ftype=0

Then run the docker info command to see whether Docker points it out. You will have to use a new enough Docker version, as older ones don’t report d_type information. Below is the output from Docker 1.13.

Storage Driver: overlay
Backing Filesystem: extfs
Supports d_type: false

If you see the above on your system, you are in trouble. Not only Discourse but other container apps may run into strange problems when they do file operations on the overlayfs. Fix it ASAP!

The solution

The key is to give your Docker a filesystem with d_type support. Unfortunately this option can only be set while creating the filesystem. Once the filesystem has been created, the only way to change it is to:
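The same check can be scripted for monitoring or provisioning. Here is a minimal sketch that reads docker info output from stdin; the function name docker_supports_d_type is made up for this post, and it relies only on the “Supports d_type” line shown above.

```shell
# Hypothetical helper: read `docker info` output on stdin and succeed only
# if it contains a line reporting "Supports d_type: true".
docker_supports_d_type() {
    grep -Eq '^[[:space:]]*Supports d_type:[[:space:]]*true[[:space:]]*$'
}
```

It could be used as docker info 2>/dev/null | docker_supports_d_type || echo "backing filesystem lacks d_type support". Note that it also fails when the line is missing altogether, which is what older Docker versions produce.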

  1. Back up the data.
  2. Recreate the filesystem.
  3. Restore the data.

Steps #1 and #3 are out of the scope of this post. Let’s focus on step #2, how to create the filesystem in the correct way. Two options exist: use XFS or Ext4.

If you prefer XFS

When you run the mkfs.xfs command to create XFS on your partition/volume, make sure you pass the -n ftype=1 parameter. The command line looks like below:

mkfs.xfs -n ftype=1 /path/to/your/device

If you prefer Ext4 FS

An Ext4 filesystem created with default options supports d_type, so there is no special parameter to use when you create an Ext4 filesystem on your partition/volume. Easy!

Tips for Docker and Discourse

Since Docker puts its files under the /var/lib/docker directory, you only need to make sure d_type is supported for this specific directory. So if you have free space on your disk, you don’t have to touch your whole root filesystem. Just allocate some space, create a new filesystem with the correct parameter, then mount it at /var/lib/docker and it’s done.

As for Discourse, this procedure won’t even hurt your Discourse data. Discourse puts all its data under the /var/discourse/share directory. When you get a new /var/lib/docker directory, only the container definition is gone. You just need to recreate the container with the launcher script, and the site will be back to normal.

That said, backing up your data before doing any filesystem or disk related operation is still good practice!


My post regarding this issue on Discourse official forum.

Issue: Centos 7 fails to remove files or directories which are created on build of the image using overlay or overlay2.

Issue: overlayfs: Can’t delete file moved from base layer to newly created dir even on ext4
