So far, we have been making our own sites. I want to touch briefly on the issue of installing software packages on web servers, and some things you might want to know when you do this.
Objectives
- Be able to work in the shell environment.
- Understand some of the basics of installing applications.
Downloading & Unpacking
The idea of open source software is probably not new to you. It still is hard for some people to wrap their head around the idea that a group of people would do a lot of programming and then give it away for free. Chances are, you have had one of two experiences with programming. First, you may have felt that it was worse than dragging yourself over hot coals just to get something simple to work. Second, you might have felt like a powerful wizard when you bent the computer to your will. And not infrequently, people feel both of these things at different times in the process. Either way, it might seem strange to you that many programmers are interested in giving their work away for free.
I’ll leave aside the question of motivations for now, but we can say that some of the free software that is available is of the highest quality, sometimes of much higher quality than commercial solutions. Some of the most important and most popular software that runs on servers is open source software. Several free Unix derivatives are very popular operating systems for servers. Of all the web servers on the internet, more than half run the open source Apache software. (Microsoft’s commercial servers make up less than a third, and the rest are made up of a number of other systems.) Other pieces of web server technology, like the PHP interpreter and MySQL, are likewise open source software.
Perhaps more importantly, many web applications are open source. This has a couple of implications. First, many (though not all) open source systems are free of charge, so that you can install them without any expense out of pocket, except for the cost of the server itself. Second, because you can see the programming, you are able to learn from it, change it, and extend it.
Generally, open source projects make their work available from a download site. Some make use of a version control system that allows you to download the most recent changes. The use of such repositories is beyond the scope of this short introduction, but you should be aware of their existence. For most projects, the most recent version of the software is made available for download. Often, these files are kept on a service called SourceForge. Many large projects, however, have a website or portal of their own, which includes downloads, documentation, and often discussion boards devoted to the project.
To provide four very widely used examples, you might take a look at WordPress, Drupal, phpBB, and MediaWiki. These are some of the most widely used community and content management systems on the web, and they are yours for the legal downloading. But what do you do when you have them downloaded?
Well, most include a pretty detailed install file. Look on the site for the installation document, but if you cannot find it, most of the compressed files you download have a text file called README and/or INSTALL included, each of which provides information on how to install the software. These can either be pretty involved or pretty simple. WordPress, for example, prides itself on a “5 minute installation,” which involves little more than downloading, unpacking, and starting up the installation process from the web. Others may require you to set up a database by hand, or edit a configuration file accordingly. This sub-module is designed to give you some hints on how to go about doing this.
Before you can get to those README files, you probably have to unpack the compressed file. Chances are, if you’ve been on the web for a while, you’ve encountered a zip file, and increasingly this software is available as a zip. Most of you have probably zipped and unzipped files before, and there are lots of software packages that will allow you to do that. But many packages are only available as a “tar.”
The name “tar” comes from “tape archive,” back when such things were common. You can untar a file on your local machine, and then upload it using FTP, or you can upload it to the server and then log in and untar it from the command line (shell).
The most common kind of tarball (which is what people call files that have been tarred) is something.tar.gz. That is a file that has been tarred and then gzipped to make it even smaller. To untar such a file, you can type:
tar -xvzf something.tar.gz
If it isn’t gzipped (that is, if it is just something.tar), you type the same thing, but with the options xvf, rather than xvzf. Here x means extract, v (verbose) lists each file as it is handled, z runs the archive through gzip, and f says that the name of the archive file comes next, which is why f goes last. In either case, this will show you the files being expanded and placed in the appropriate places.
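If you would like to try this out without risking anything, here is a minimal sketch of the whole round trip. It assumes a tar that understands the z option (GNU and BSD tar both do), and the names demo_site and something.tar.gz are made up for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a tiny "package" to archive (hypothetical names).
mkdir demo_site
echo "Read me first" > demo_site/README
echo "<?php echo 'hello'; ?>" > demo_site/index.php

# c = create, z = gzip, f = the archive file name comes next.
tar -czf something.tar.gz demo_site

# t = list the contents without unpacking anything.
listing=$(tar -tzf something.tar.gz)
echo "$listing"

# x = extract, v = show each file as it is unpacked,
# -C = unpack into the named directory instead of the current one.
mkdir unpacked
tar -xvzf something.tar.gz -C unpacked
```

Listing an archive with t before extracting it is a cheap way to see whether it will unpack into its own folder or scatter files across your current directory.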
Shell
Of course, to be able to do this expansion on the server, you have to actually have access to the server via a user login. Traditionally, this meant logging on using a telnet connection, but most servers have stopped allowing telnet, and only allow ssh (secure shell) access now. If you are young enough never to have used DOS or ever seen a command line, the idea of typing commands interactively into a computer is probably a bit foreign. Early systems that provided graphical interfaces were simply built on top of command line interfaces, and it was not unusual to “drop down” to the command line to do something serious. This is still the case with most flavors of Linux, and even Macintosh OSs now provide a console that allows you easy access to the command line.
To sign on, you will first need a server that allows you shell access. Not all web hosting companies allow this level of access, because it can be more difficult for them to preserve security in a multi-user environment. Most of the good ones do allow this, because it is sometimes easier to get things done from the shell than it is to make changes on a client machine and then try to manipulate the server through executables and FTP. Once you know you are allowed access to the server command line, you will need a client, like PuTTY, that gives you access via ssh.
Once you are in, you need to be able to move around. There are a few vital commands for this. First, you should try to see where you are. Type:
pwd
Even though that looks confusingly like “password,” it is, in fact, “print working directory,” and will tell you where you are in the directory structure. Directory structures are pretty similar across many operating systems, at least at the most basic level. On Unix machines (and their derivatives, sometimes called *nix), the root directory is represented as a single slash: /. This is the “highest” directory; you cannot move up any further. In order to move up or down the tree of directories, you type:
cd
or “change directory.” To move up a directory (or go to a directory’s “parent”) you type cd .. . To move down to a child directory, you type cd directory_name, where the directory_name is the name of the directory you want to open.
So, for example, if, when you logged in, you did a pwd and discovered you were in /home/alex, you could enter cd .. and your new directory would be /home. If you entered cd .. again, you would be at the root directory, or /.
(For those of you who *do* remember DOS, you’ll note that the root directory does not include a drive name, like C:/ . For the most part, the physical disk structure is not obvious in *nix systems, though there are ways of attaching and accessing multiple disks.)
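The walk up and down the tree can be sketched as follows. To keep it safe to run anywhere, this builds a stand-in home/alex tree inside a temporary directory (an assumption for the demo) rather than using the real /home.

```shell
# Build a small directory tree in a scratch area (names are made up).
base=$(mktemp -d)
mkdir -p "$base/home/alex"

cd "$base/home/alex"
start=$(pwd)          # where we are now
echo "$start"

cd ..                 # up to the parent directory
parent=$(pwd)
echo "$parent"

cd alex               # back down into the child directory
back=$(pwd)
echo "$back"
```

The first and last lines printed are the same path, and the middle one is that path with the final /alex removed.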
In order to determine what directories and files are available in your current directory, you might type:
ls -la
The ls command gives you a list of files and directories in your current working directory. The two options I’ve included above are “l”, which tells it to use the long format (showing each file’s permissions, owner, size, and modification date), and “a”, which includes hidden files (which are, by default, those that start with a “.”). If this listing is too long, you can “pipe” it through another application, “more”, which gives you one screen of data at a time, using the following:
ls -la | more
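To see what the “a” option changes, here is a small sketch. The two file names are only examples, created in a temporary directory.

```shell
# A scratch directory with one ordinary file and one hidden file.
demo=$(mktemp -d)
touch "$demo/index.php" "$demo/.htaccess"

plain=$(ls "$demo")        # hidden files are left out
all=$(ls -a "$demo")       # -a includes names that start with "."

echo "$plain"
echo "$all"
ls -la "$demo"             # long format: permissions, owner, size, date
```

The plain listing shows only index.php, while the -a listing also shows .htaccess along with the special entries . (this directory) and .. (its parent).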
If you want to move into a directory that does not yet exist, you will first have to create it, using the mkdir command. You can delete any empty directory using the rmdir command. The rm command allows you to delete files, and using wild cards, you can (either intentionally or accidentally) delete large chunks of the material on the server.
Finally, you can copy files using the cp command, or move them using mv. For each, you just follow with the “from” and “to” filenames.
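Here is a short sketch that strings these commands together. Everything happens inside a temporary directory, and the names (uploads, notes.txt) are invented for the example.

```shell
site=$(mktemp -d)
cd "$site"

mkdir uploads                      # create a new directory
echo "draft" > notes.txt
cp notes.txt uploads/notes.txt     # copy: "from" first, then "to"
mv notes.txt notes.old             # move (here, effectively a rename)

rm uploads/notes.txt               # delete a single file
rmdir uploads                      # works only because uploads is now empty

# Wild cards match many files at once, so it is worth listing
# what a pattern matches before handing the same pattern to rm.
ls *.old
```

There is no trash can at the command line, so running ls with a wild card first, and only then rm, is a habit that pays for itself.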
When you indicate filenames, you can do so either absolutely or relatively; more frequently the latter. So, you might say:
cp alex.php ../alex.php
which means “Copy the file in my current working directory called ‘alex.php’ to a file in my parent directory called ‘alex.php’.” This idea of walking through the directory structure should be familiar to you. When indicating where an image is in an HTML file, you might write
<img src="images/x.png" />
which means go into the images folder (relative to the location of the HTML file) and find the file x.png there. You might also see something like:
<img src="../images/x.png" />
which means go up one level and find a folder called images. Then look for x.png in that folder.
Configuration
One of the things you may be doing in the shell is configuring a piece of software. Many systems include a file that is called something like “conf.php” that contains many of the “options” for the site. These may include information like the name of your database and your database user and password, or the URL of your site, or the location of certain important files. In many cases you can edit this conf file on the server itself. First, it’s a good idea to save a copy of the file, by doing something like:
cp conf.php conf.2009.02.15.bak
This way, if you break anything, it is pretty easy to just restore your old version of the configuration file. Now, if you are editing the file on the server, and if you don’t know which editor to use (and there are great battles over this), you might try seeing if the server has an editing program called pico. Pico is a very lightweight text editor, and is not difficult to understand. Just make your changes and hit CTRL-X when finished; pico will ask whether you want to save before it exits.
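If you would rather not type the date by hand, the backup step might look something like this sketch. The conf.php created here is only a stand-in, and the date format is just one reasonable choice.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "<?php \$db_name = 'example'; ?>" > conf.php   # stand-in config file

stamp=$(date +%Y.%m.%d)             # for example 2009.02.15
backup="conf.$stamp.bak"
cp conf.php "$backup"               # save a dated copy before editing

# ...edit conf.php here...

# If the edit goes wrong, restore the saved copy:
cp "$backup" conf.php
```

Putting the year first in the date means the backups sort in chronological order when you list the directory.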
One of the other things you may have to do to get an application installed is to set up a database for it. Generally, this is most easily done in phpMyAdmin when it is available. You can also access mysql from the command line, and create tables and users from there.
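For the command-line route, a rough sketch follows. The database name (blog), the user (bloguser), and the password are all placeholders, and many applications will create their own tables once the database and user exist. The script only writes the SQL to a file; the line that would actually send it to MySQL is left as a comment.

```shell
# Hypothetical names: a database "blog", a user "bloguser",
# and a placeholder password you would replace.
sqlfile=$(mktemp)
cat > "$sqlfile" <<'SQL'
CREATE DATABASE blog;
CREATE USER 'bloguser'@'localhost' IDENTIFIED BY 'change-this-password';
GRANT ALL PRIVILEGES ON blog.* TO 'bloguser'@'localhost';
SQL

cat "$sqlfile"

# To run it, feed the file to the mysql client as an administrative user
# (you will be prompted for that user's password):
#   mysql -u root -p < "$sqlfile"
```

The database name, user, and password you choose here are typically the same values the application’s configuration file will ask for.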
Security
In many cases, you will need to set the access permissions on a file or on multiple files. At the command line, you can do this using the chmod command. The command is made up of three parts: the command itself, the requested permissions, and then each file or directory you want to apply it to. Generally, the requested permissions are a three-digit number, with the first digit representing the permissions for the user (the file’s owner), the second for the group, and the third for the rest of the world. Each digit runs from 0 to 7 and is the sum of 4 (read), 2 (write), and 1 (execute), so 7 means something can be read, written, and executed, 0 means none of those things, and the other values fall somewhere in between. New files typically start out at 644, which means the owner can read and write, and everyone else can only read; 744 would also let the owner execute the file.
To see what permissions are currently in place, do an ls -la. The column on the far left indicates whether the file is a directory (d) or not, and then whether the user/group/public can r (read), w (write), or x (execute) this file.
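A quick sketch of how the digits line up with the letters that ls shows. Each digit is read (4) plus write (2) plus execute (1), so 6 is read and write, and 5 is read and execute. The file name is made up, and the file lives in a temporary directory.

```shell
demo=$(mktemp -d)
touch "$demo/page.php"

chmod 644 "$demo/page.php"                       # owner: read+write; group and world: read
mode_644=$(ls -l "$demo/page.php" | cut -c1-10)  # keep just the permission column
echo "$mode_644"                                 # -rw-r--r--

chmod 755 "$demo/page.php"                       # owner: everything; group and world: read+execute
mode_755=$(ls -l "$demo/page.php" | cut -c1-10)
echo "$mode_755"                                 # -rwxr-xr-x
```

The leading dash in each result is the spot where a d would appear if this were a directory.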
This might be a good time to at least touch on the role of Unix security. We didn’t talk much about the ways in which PHP can get people logged into a site, and give them permission to do certain things. However, for simple security, Apache servers have a special file called .htaccess. If you have ever gone to a site and a pop-up window has asked you for your username and password, you have probably run into this file. .htaccess lets you require a login for a directory; the allowed users and their paired passwords are kept, in a “hashed” format, in a separate password file that .htaccess points to. For more information on setting up .htaccess to secure information on your site from prying eyes, and do many other useful and wondrous things, take a look here.
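As a rough illustration, password-protecting a directory with Apache’s basic authentication might look like the sketch below. The realm name, the user alex, and the path to the password file are all placeholders, and your host has to allow these directives in .htaccess files for any of it to take effect.

```shell
# All paths and names below are placeholders, not taken from a real site.
protected=$(mktemp -d)      # stands in for the directory you want to protect

cat > "$protected/.htaccess" <<'EOF'
AuthType Basic
AuthName "Members only"
AuthUserFile /home/alex/.htpasswd
Require valid-user
EOF

cat "$protected/.htaccess"

# The password file itself is usually created with Apache's htpasswd tool,
# which prompts for the password and stores it hashed:
#   htpasswd -c /home/alex/.htpasswd alex
```

It is generally wise to keep the password file outside the folders the web server hands out, so that nobody can simply download it.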
Cron & Backup
You should assume that your site will fail periodically, and the only thing that will save you is periodic backups. This is often done by running a tar command automatically every so often, and sometimes FTPing that tarball to another system for safe keeping. Databases are also frequently dumped and saved off.
One way to make this happen is by using a service that runs on computers called cron, which executes scripts periodically. So, once you set up a list of commands you want to execute periodically in order to do backups, you would tell cron to take care of this. There are other tasks you might also schedule using cron, like getting feeds from a website.
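A bare-bones backup along these lines might look like the following sketch. The directories are temporary stand-ins, and the database names, the script path, and the schedule in the comments are assumptions for illustration only.

```shell
# Hypothetical layout: the site lives in $site, backups go to $backups.
site=$(mktemp -d)
backups=$(mktemp -d)
echo "<h1>Home</h1>" > "$site/index.html"

stamp=$(date +%Y-%m-%d)
archive="$backups/site-$stamp.tar.gz"

# Archive everything inside the site directory into one dated tarball.
tar -czf "$archive" -C "$site" .

# A database dump is often saved alongside it. Run by hand, that could be
# (the -p option prompts for the password):
#   mysqldump -u bloguser -p blog > "$backups/blog-$stamp.sql"

# Saved as a script, this could be scheduled with a crontab line such as
# the following, which would run it every day at 2:30 in the morning:
#   30 2 * * * /home/alex/bin/backup.sh
```

The five fields at the start of a crontab line are minute, hour, day of the month, month, and day of the week, with * meaning “every.”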
Unfortunately, many web hosts don’t allow you to use cron. It makes some sense why they might not, since it can be used to run programs on your server even when no one is there. In those cases, you have to do some tricky server-side programming to check, whenever anyone visits your site, whether some task has been performed recently. Many systems now include these ways of working around the lack of cron, including Drupal’s software.
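The logic behind such a workaround is simple enough to sketch. Real systems do this in the application’s own language on each page request; shell is used here only to show the idea, and the timestamp file and one-hour interval are made up.

```shell
# Sketch of a "has this run recently?" check.
stampfile=$(mktemp)                # remembers when the task last ran
interval=3600                      # run the task at most once an hour

run_if_due() {
    now=$(date +%s)                # current time in seconds
    last=$(cat "$stampfile")
    last=${last:-0}                # an empty file means "never run"
    if [ $((now - last)) -ge "$interval" ]; then
        echo "$now" > "$stampfile" # record this run
        echo "ran"
    else
        echo "skipped"
    fi
}

first=$(run_if_due)    # nothing recorded yet, so the task runs
second=$(run_if_due)   # called again right away, so it is skipped
echo "$first $second"
```

The catch with this approach is that the task only runs when somebody visits, so a quiet site may go a long time between runs.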
As a side note, you can also manually download your database tables directly from phpMyAdmin, using the export tab. This allows you to download to Excel, for example, as well as to a SQL file.
Themes/Plugins
Most content management systems allow you to “skin” the page using, at a minimum, CSS. This feels, in practice, a lot like the Zen Garden site. By making sure each section is well marked with DIVs and SPANs, users can focus on using CSS to make the site look the way they want it to.
Many such systems allow you to move beyond just the CSS to edit the templates as well. Templates are generally like HTML files, but usually with PHP (or some other language) being used to indicate where dynamic content should go. So, if you look at the index page from a WordPress install, you will find short PHP tags mixed in with the HTML, marking where things like each post’s title and content should appear.
There are also extensions to most of these programs that are called “plug-ins” or “modules.” Many of these can be downloaded and added to the program you are using to give it more functionality that is specific to your needs. It is also possible to write such plugins, particularly in PHP in many cases. The documentation for the project can give you a feel for how to make use of “hooks” that allow your programs to be called by the larger program, and share data with it.
There is a lot more to know about how to install this stuff, and how to do updates, but the above information is enough to get you started. The rest you will learn, inevitably, through trial and error.