Saturday, December 29, 2012

Using the Vi Editor


Vi is probably the most popular text editor for Linux. Even if you don't like it, you may end up using it quite often. If you need to make a quick change to a file, you can't beat 'vi'. This is not meant to be an exhaustive guide to vi; it is just meant to show you how to use the most common (and useful) commands.

Sometimes you need vi
I had an unpleasant surprise once. A friend of mine who had installed Linux had somehow changed the default editor from vi to joe. He called to tell me that his crontab entries didn't do anything. One more reason to get to know vi. The crontab command opens your entries in the editor named by the EDITOR environment variable, falling back to vi, and the entries may not be installed properly if you use certain alternative editors.

vi basics
Open file with vi
vi /etc/hosts.allow

Of course, just opening the file gets you nowhere, unless you only want to read it. That can be just as easily done with less, so we'd better look at what to do with 'vi' beyond just opening the file.

Adding text
To add text, you would type the following:
ESC + i

(i for insert)

And there you have it. Now you can type away. Of course, it doesn't really do us too much good if you're not able to save what you've typed. So let's save it, shall we?

Saving the file
ESC + :w

(w for write; press Enter after typing it)

Closing the file
Now, to finish you would type the following:

ESC + :q

(q for quit)

Of course, there is a lot more that you may need to do. Let's have a look.

Vi for budding power users
Again, my aim is not to go into a treatise on vi, but here are a few more commands that you might need to do a little more heavy lifting.

Removing Lines
You may find that you need to remove an entire line from a file. Just place the cursor anywhere on the line and type:
ESC + dd

(d for delete, typed twice)

Changing your mind
Sometimes you wish you hadn't done something. With vi, you can undo what you just did.
ESC + u

(u for undo)

Changing your mind (again)
Again, you have second thoughts. With vi, there are no regrets.
ESC + :q!

(q! for quit and I *really* mean it!)

The exclamation point (!) is used in vi to override its safety checks. If you have made changes to a file and try to quit, vi will refuse and remind you that the changes have not been saved. ':q!' means that you want to quit without saving.

Where did I put that word?
Did you misplace a word in your file? You can find it easily with vi
ESC + /word

If you're looking for the word nonplussed in your text (as in: 'vi is so easy I am nonplussed') you would type:
ESC + /nonplussed

and vi will jump to the next instance of the word nonplussed. Press n to move on to each following instance.

Can I change that word?
Maybe you don't want to use the word nonplussed. Perhaps it's better to use a more well-known word. You can use vi to change the word.

First you could use
ESC + /nonplussed

to look for that word. When you find it, you would then type
ESC + :s/nonplussed/amazed/

to replace the first occurrence of the word on that line.

If you were sure that you wanted to replace all instances of that word in the whole text, you could type this
ESC + :%s/nonplussed/amazed/g

and nonplussed would be changed to amazed throughout the text.

If you want to get some control over what you replace - that is, you want to use both nonplussed and amazed - then you would add gc to the end:
ESC + :%s/nonplussed/amazed/gc

Vi will now ask you for confirmation.
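The s/old/new/g syntax is not unique to vi. As a minimal sketch, assuming a standard sed is installed, the same substitution can be tried on a throwaway file from the command line. The sample file and its two lines are made up for this illustration.

```shell
# Hypothetical sample file, created only for this demonstration.
demo_file="$(mktemp)"
printf '%s\n' 'vi is so easy I am nonplussed' 'nonplussed, truly nonplussed' > "$demo_file"

# Same pattern as :%s/nonplussed/amazed/g, applied to every line of the file.
# sed prints the result and leaves the file itself untouched.
replaced="$(sed 's/nonplussed/amazed/g' "$demo_file")"
printf '%s\n' "$replaced"

rm -f "$demo_file"
```

Unlike the gc variant in vi, sed does not ask for confirmation, so it is best tried on a copy first.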

Vi configuration settings
There are some basic vi configuration settings that you should be aware of if you want to use this text editor comfortably.
Word Wrapping
If you don't want your words running off the page into oblivion, you can set the word wrapping feature in vi:
ESC + :set wm=30

This is a good wrap for me personally. It sets the wrap margin to 30 characters from the right edge, so lines break early. The bigger the number, the sooner you get a line break. If you send something via email, then this tight wrap ensures that it will get there without the lines being broken into stair-steps.

Vi as an email editor
What did you say? Vi and email? Yes! You can use vi as an email editor. This is most commonly done with the email client mutt.

More Vi
This just scratches the surface of what you can do with vi. Here I've tried to show you what might be the most useful features and settings of vi. It should be enough to get some basic work done with this ubiquitous editor for Linux.

Security Issues

Administering Secure Systems
Batten down the hatches! Even people who seldom log on to the Internet are faced with the possibility that their machine's security could be compromised. Microsoft Windows has been particularly good at attracting all sorts of exploits, worms and viruses. Compared with Windows, Linux appears less friendly to the unwanted. As a matter of fact, several large Linux consulting firms have offered large sums of money to people who could infect a well-maintained Linux system with a virus. But there's the key! The system must be well-maintained. It's imperative to check popular Linux security sites on a daily basis. A good place to start is your distribution vendor/developer's own website.

You must also know that running programs as 'root' must be done as little as possible. However, even taking the appropriate precautions doesn't ensure that you won't become the host of the most significant risk facing Linux, the trojan. Named after the famous Trojan Horse from Greek mythology, this is simply a program that appears perfectly normal on the surface but has in fact been modified to serve as a back door for entering your system or for using it as a host for attacks against other systems. These trojans don't necessarily have to reside in little-used programs either. In August 2002, a "trojaned" version of OpenSSH, the open source version of the secure shell program, was distributed from official download sites. This case is particularly nasty owing to the fact that secure shell is used extensively as a security measure itself. This is like having corrupt policemen on duty. You may be saying to yourself: If I download a popular program from an official mirror, what fault of mine is it if it's been tampered with and I didn't know it? Well, the fact is, as a good system administrator, you have the tools at your disposal to check for tampering.

Checking integrity of packages
Most Linux distributions come with a tool so that you can verify the "authenticity" of a downloaded package. It is called md5sum. md5sum reads every byte of a file and computes a short, fixed-length "hash" (or checksum) from the contents. Therefore, if someone were to tamper with the program, changing even a single byte, the hash would come out different. If you created a software package, for example, you could make an md5 checksum of that package. You would simply do this:
md5sum my_package > my_package.md5

and publish this on your website. Then people who were interested in using your program could download the package along with the checksum and verify the authenticity of the program. The checksum file is actually nothing more than this:
ac953e19a05816ed2159af092812f1de  my_package

Those who are interested in checking the integrity of the file would type this:
md5sum -c my_package.md5

If the file hasn't been tampered with, you should get a message like this:
my_package: OK

If someone has done some funny business to it, you would theoretically get output like this:
my_package: FAILED
md5sum: WARNING: 1 of 1 computed checksum did NOT match

And it would be assumed that people would get in touch with you to tell you that your program's checksum does not match the package. Someone has obviously altered the package, so its contents no longer match the checksum. It should be pointed out, however, that any evil cracker worth his or her salt is not just going to substitute the program on your server with a trojaned one. They will most likely provide a checksum to suit their changes as well. As you can see, md5sum is good, but it is not 100 percent reliable for checking the integrity of a package. So how can you be sure? We need to go to a higher level of reliability.
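A minimal sketch of the whole check, assuming GNU md5sum and a throwaway directory. It reuses the name my_package from the example, but the file contents are made up. The point it tries to show is that a changed file fails verification even when its size stays exactly the same.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir="$(mktemp -d)"

# A stand-in "package" of 12 bytes, plus its published checksum file.
printf 'hello world\n' > "$workdir/my_package"
(cd "$workdir" && md5sum my_package > my_package.md5)

# Verify the untouched file.
first_check="$(cd "$workdir" && LC_ALL=C md5sum -c my_package.md5 2>/dev/null || true)"
printf '%s\n' "$first_check"

# "Tamper" with one character; the file is still 12 bytes long.
printf 'hello w0rld\n' > "$workdir/my_package"
second_check="$(cd "$workdir" && LC_ALL=C md5sum -c my_package.md5 2>/dev/null || true)"
printf '%s\n' "$second_check"
```

The first check should report OK and the second FAILED. The `|| true` is only there because md5sum exits with an error status when a checksum does not match.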

GnuPG
GnuPG is encryption software that now comes standard on most major Linux distributions. It is the Free Software version of the popular PGP (Pretty Good Privacy) personal encryption program developed by Phil Zimmermann in the early 1990s (and for which he was the object of a US Government investigation for a few years!). I bring up the government investigation because it was assumed at one point that encryption would only be used to hide criminal activity from the authorities. In truth, you can't really impede unscrupulous people from using technology for bad purposes, but that doesn't mean you should stop law-abiding citizens from using it for good purposes. GnuPG and PGP can be used to encrypt files for secure communication, but they are also being used more and more to establish the authenticity of a company's or an individual's work. A package is normally "signed" with the developer's private key, and anyone can check that signature with the matching public key. If you were interested in checking the signed package, you would first get the public key from the developer's website. These are normally plain-text files with the extension *.asc. Then you would import the key into your "keyring".
gpg --import acme.asc

Then you would download the package and its signature. This signature usually has the same name as the package but also with an *.asc extension. Now you can verify the package:
gpg --verify acme_cool_app.tar.gz.asc

You should get a message about when the signature was created, by whom and whether it is good or not. If it's good, then you should feel fairly confident that you're dealing with an authentic package.

RPMs
RPM files come with their own built-in mechanism for verifying packages. As with the above example, you should get the developer's public key and import it. The most recent version of the RPM system uses its own process of importing the key. Check the documentation on your system to see what version you're using and how to do it. You can then check the integrity of a downloaded RPM in this way:
rpm --checksig acme_app.0.02.rpm

You will get a message like this:
acme_app.0.02.rpm: md5 gpg OK

As you can see, it's now not enough just to keep track of security alerts and download and install the updated packages. You should now take extra measures to ensure that the packages you've downloaded are the "real deal" and haven't been tampered with in any way.

Bug Fixes
Computer programs contain bugs. This is, at present, an inescapable fact of life. Some of these bugs can be exploited for less than noble purposes and then they become security issues. Some are just silly little things (developer forgot to make menu item 3 do something). Other bugs may cause the program to crash at inopportune times and result in data loss. Regardless of the severity of the bug, you will need to update programs from time to time because of either harmless or extremely evil and annoying bugs. You should follow the same procedures above to verify the authenticity of the packages.

Installing New Versions
Developers normally release new versions of their software. Change is really the name of the game of software development. The changes that concern us here are not really those that are mentioned above. If there is a bug fix or a security issue, it's imperative that you install a new version. Although a new version may be released because of one of these issues, what we'll consider here is the installation of a new version that has been released to offer users new features.

Major Updates
Sometimes a company or individual developer releases a new version that contains major changes to the program. It could be a total re-write of the application in question. It may be the addition of multiple new features. If you're running a web server in a production environment (a public server that is vital to your company's revenue stream, for example), you sometimes need to make some hard decisions about updating to a major version change. The update might "break" existing scripts. If you don't have a development or test server to try out the new version, you might be playing with fire if you just go ahead with the update on your public server. It's always best to ease the changes in. Try them out first in a development environment. Create a mirror of your production environment on a different server and observe any anomalies. You may find that the major version isn't worth installing. Recently, for example, many organizations running the Apache webserver in version 1.3.x have considered it unnecessary to update to the latest major version change, version 2.0.

Simple Programs
You've heard that a fairly small program that you can't live without has been updated. In the Linux world you can be fairly sure that an update isn't going to break anything major. Most Linux programs aren't created under a strict profit incentive system, so there's no reason for the developer not to provide backwards compatibility. My relationship with Windows 95 was soured very quickly when I saw that I couldn't open up my Word for Windows 2 files in Windows 95's WordPad. I have yet to have an experience like this in Linux. Of course, some programs dynamically link to new libraries. If you don't have these libraries installed on the system, you will normally be unable to run the new version of the program. Both the RPM system and Debian's apt-get package system will check dependencies for you before you install. You may find that you'll have to update some libraries.

Libraries
Without going into a lot of detail, a library is a piece of code that provides your program with something it will use over and over again. The buttons that GUI programs commonly use are rendered using libraries. This is just one example, but there are literally thousands of different types of libraries that your Linux system could take advantage of. There are also two basic types of libraries. There are those that are compiled right into the program. These are called statically linked libraries. You will seldom have a problem installing programs with these kinds of libraries because the developer put the libraries that the program needs right into the binary (or executable file). The other type of library is the dynamically linked or shared library. This means that you have a program that depends on the existence of a certain library to run. When you start the program, it looks for that library to provide functionality that it needs. For example, if you download the Opera Internet browser, on their download page you will see that they offer two types of files.

As the browser depends on the Qt libraries for its GUI, Opera provides a file with these libraries statically linked and another one without the libraries. In the latter case, the executable will look for the Qt libraries that are already on your system. The advantage of the dynamic or shared library is the size of the executable file. The major disadvantage, however, is that you may not have that library installed, or worse yet, that you may have an older version of the library installed. I say worse because you may find that a number of programs need the older version of the library to run. If you updated your libraries, you would invariably "break" these programs. Actually, if you use standard tools like RPM or apt-get, you would be told explicitly that the new libraries conflict with dependencies of other programs. This is the nasty and dreaded dependency conflict. Here you're faced with two options: update the older programs too or forget about the new version of the other program. That's not what management gurus would call a win-win situation. These are, of course, value judgments that you have to make, along with appropriate guidance from the people who pay the bills!

Server applications
As alluded to earlier with Apache, your organization may decide that it's time to update the web server, mail server or any other server software that your machines may be running. Again, as we mentioned previously, it might be a thorny issue. Much like the situation with the dynamic or shared libraries, some servers also depend on secondary modules to help out with the work they're doing. Such is the case with Apache, which employs modules to provide features for delivering web content. Two common modules used by Apache to provide for interactive web pages are mod_perl and mod_php. These allow Apache to deliver content using Perl and PHP scripts respectively. Perl, at the same time, is a programming language that also has its own modules. As a recent case I was involved with shows, you may make a decision to update Perl modules (or remove some) and find that Perl scripts on your server "break". That's not good in a production environment.

The Linux Kernel
The mother of all parts of the Linux operating system is the Linux kernel. That is what Linux really is. That is what Linus Torvalds started working on in 1991 and that's what eventually has turned into the base of what the whole Linux world is about. At the time of this writing, kernel 2.4 is the most recent major stable version of the kernel and development on version 2.6 (called version 2.5 as it is still not "stable") is quite advanced. 2.6 is reportedly right around the corner, so that always brings up the question: Should I update to the new kernel?

Hardware considerations
This is mostly taken care of. The Linux kernel is all about supporting hardware, as any kernel is. A new kernel brings things along from past versions so you should have no problem on the hardware side. People normally run into issues with bleeding edge hardware. As we're not talking about backwards compatibility, it's not an update problem. But just as anecdotal evidence that you may run into problems, I noticed that when I switched from the 2.2 to 2.4 kernel, the driver for a common network interface card, the RealTek 8139, was changed in the 2.4 kernel. This driver, in my opinion, was worse than the old one and did cause some problems at first. Normally, however, you shouldn't run into hardware support problems on existing equipment when updating to the newest version of the kernel.

Software considerations
The two major reasons for updating kernel versions are to get support for new hardware and to take advantage of enhanced features in the kernel for running programs. Some of these enhanced features may necessitate a move to new software for some tasks. A major change in the latest stable kernel that comes to mind is the greatly improved network filtering capabilities. I remember that this prompted me to move to using netfilter or iptables for developing firewalls. Previously, the standard firewall method was using ipchains, and iptables provided a distinct improvement over it.

There are all kinds of things to keep in mind when you compile a new kernel. The issues are as varied as the types of machines you might run into out there. We'll go into more detail on kernel issues in our section on compiling the Linux Kernel.

Webservers

Although the Internet existed decades before it became popular with the public, this popularity is mainly due to the invention of the World Wide Web. The pages that make up the WWW are all served from machines running a type of software that has become known as a webserver.

Apache webserver
The most popular web server by far is the Apache web server. It originated as a set of patches to provide functionality to the original httpd web server (the name Apache comes from "a patchy webserver"). It is released under its own open source license (called, unsurprisingly, the Apache license), it is available as a free download and it comes with most major Linux distributions. The combination of Linux and the Apache webserver accounts for over 60 percent of the servers on the Internet.

Most major Linux distributions come with Apache and they offer you the possibility to install it. What's even better is that now most distributions will even configure Apache during the install process to work together with other complementary web development packages that you may have chosen to install as well. These might include PHP, mod_perl and mod_python. These advances in the ease of install are surely welcome. I remember installing Apache from a tarball in the early days of my Linux experience and it was somewhat time consuming to get Apache to play well with all of these add-ons. This should not be an issue anymore. You can, of course, install from a tarball and get some really personalized configurations - but that goes way beyond the scope of this course. Although I normally don't like to use the expression "way beyond the scope of ...", it is a fact that entire books are dedicated to Apache alone. What we will do is deal with ways to take advantage of some of Apache's features that you can get "out of the box".

httpd.conf
The main configuration file for the Apache webserver can be found, normally, in /etc/httpd or /etc/apache - depending on where your distribution chooses to place it. As I mentioned before, most distributions do a pretty good job of configuring a working web server, but you may want to change some things so Apache works more to your liking. Before making any changes though, I recommend making a copy of httpd.conf. It's a fairly large file and it's easy to make some change and then lose track of what you did. Then, if you find Apache's not working right, you can always go back to the original file. I usually do something like:
cp httpd.conf httpd.conf.YYYYMMDD

Where YYYYMMDD is the year, month and day. You are, of course, free to call it httpd.conf.charlie if you choose. This is really a good policy to follow when you change any config file, especially if you're dealing with services that are crucial to a company or organization. You can quickly get back to a working server and then figure out what went wrong later. Let's look at some things you can do to get Apache working to suit your needs.
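As a minimal sketch of that habit, the date command can fill in the YYYYMMDD part for you. The temporary directory and the one-line httpd.conf below are stand-ins invented for this illustration, so the example does not touch a real configuration.

```shell
# Hypothetical stand-in for /etc/httpd or /etc/apache.
conf_dir="$(mktemp -d)"
printf 'ServerName www.mydomain.ork\n' > "$conf_dir/httpd.conf"

# date +%Y%m%d expands to today's date as eight digits.
backup="$conf_dir/httpd.conf.$(date +%Y%m%d)"

# -p keeps the original timestamps and permissions on the copy.
cp -p "$conf_dir/httpd.conf" "$backup"

# Later, diff shows what you changed since the backup was made.
diff "$backup" "$conf_dir/httpd.conf" && echo "no changes yet"
```

To get back to the working version, copy the dated file over httpd.conf and restart Apache.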

Some basic security
Apache is designed so that every directory where you have created web content should have an index file. This is normally index.html, but you may also add other extensions, such as index.php, index.htm or others. The part of httpd.conf that determines this is:

#
# DirectoryIndex: Name of the file or files to use as a pre-written HTML
# directory index.  Separate multiple entries with spaces.
#

  DirectoryIndex index.html index.php3 index.php index.htm index.shtml index.cgi


Apache, by default, is going to show us the directory listing if we don't have one of these files in a directory. That's probably not a good idea from a security standpoint. We all get lazy and we may place temporary files in a webserver that we don't mean for the world to see. The best thing is to nip this problem in the bud and keep Apache from showing directory listings. You need to find this line in httpd.conf:
Options Indexes Includes FollowSymLinks MultiViews

It's a good idea to remove the Indexes option here. This will prevent a website visitor from seeing what's in the individual directories.

Document root and cgi-bin
The document root means the directory where Apache serves the web pages from by default. You will see a line like this in your httpd.conf:
#
# This should be changed to whatever you set DocumentRoot to.
#


You'll find that the Apache developers are good at explaining what things mean. That is, if you prefer your web pages to be in another place, you should change it here. Even if you want them in another place, you may not want to change this right away. Further along, I'll explain the concept of "virtual" websites, which means "hosting" more than one website. However, if you're only going to be serving one set of pages, you may change this to wherever you want. You may also want to have a look at this line:
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/

This is the directory where you can place your cgi-bin scripts. Those of us who have some web development experience will know what a cgi-bin script is. In case you don't, it's a program that's meant to be run from a form on a web page.
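For those who have never seen one, here is a minimal sketch of such a script written as a shell script. The function name print_page and the greeting are made up for this illustration. The only fixed part is the shape of the output: a Content-Type header, a blank line, then the page itself.

```shell
#!/bin/sh
# A hypothetical, minimal CGI program.
# The web server runs the script and sends whatever it prints to the browser.

print_page() {
    # Header first, then a blank line to mark the end of the headers.
    printf 'Content-Type: text/html\n\n'
    # Everything after the blank line is the page body.
    printf '<html><body><p>Hello from the cgi-bin directory.</p></body></html>\n'
}

page="$(print_page)"
printf '%s\n' "$page"
```

A real script would usually read the form data the server passes along. Any such input should be escaped before it is printed back into a page.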

Your script is placed in the cgi-bin directory and Apache knows where to find it when the form calls it. If you change the above line in Apache to have the scripts located someplace else, you also need to change a line a little farther below:
<Directory /usr/lib/cgi-bin/>
    AllowOverride None
    Options ExecCGI
    Order allow,deny
    Allow from all
</Directory>


Again, as I mentioned above, you may not even need to make these changes if you're going to be maintaining several websites on the same server. More on that further ahead.

Personal user sites
If you give somebody an account on the machine running Linux and Apache, this person has the ability to run his/her own personal website. I'm sure many of you have seen sites like: http://www.domain.com/~larry/ . This is because the UserDir module is activated in httpd.conf:
LoadModule userdir_module /usr/lib/apache/1.3/mod_userdir.so

And farther down you will find this section:
#
# UserDir: The name of the directory which is appended onto a user's home
# directory if a ~user request is received.
#

    UserDir public_html


By default, Apache designates the directory where the public webfiles (and remember, these are public!) are found to be public_html. There's no reason why you can't change this name to website or any other meaningful name. You could even comment these lines out if you don't want the users on your system to have a personal website. If you do allow this, you may want to skip down to the next line:
#
# Control access to UserDir directories.  The following is an example
# for a site where these directories are restricted to read-only.
#


There are some options here as well as to how the site will work. You should remove the option Indexes from here as well, as we did earlier.

Alias directives
Some applications that run under Linux use the Apache webserver to display some of their content. There are systems to display man pages in the browser. Some Linux distributions use Apache to give you a web-based help system and documentation. They will place their documents outside of the root webserver directory. To access this "outside" content, we need to create "Alias" lines in httpd.conf or else it will be inaccessible from a web browser. In the following example, I'll show you what I need to add to httpd.conf so that visitors can see my mailman mailing list public archives.
I found the following line in httpd.conf:
#
# Aliases: Add here as many aliases as you need (with no limit). The format is
# Alias fakename realname

Then I added these lines:
# Aliases for mailman
Alias /pipermail/ /var/lib/mailman/archives/public/
Alias /images/ /usr/share/doc/mailman/images/

This means that a person only has to type http://www.mydomain.ork/pipermail/ into a browser to see the mailing lists located in /var/lib/mailman/archives/public/. If there are any images on the page, they will also be displayed.

As you can see, Apache is very versatile - allowing us to configure it to use web content from third-party applications with relative ease.

The .htaccess file
To help with website administration, Apache adds an additional configuration file, called .htaccess (yes, with a dot (.) in front of it), where you can add more options that affect how your website works.

No more 404s
As a web surfer, nothing annoys me more than a "404 not found" page. This is what Apache will show you by default when you request a page that has disappeared.

404 is the HTTP status code for a request for a page that does not exist. Web-savvy people now refer to a missing page as a "404".

Not Found: 
The requested URL /bla.html was not found on this server.
Apache/1.3.26 Server at www.dominio.ork Port 80

As it's frustrating as a user to find this page, it's my job as a webmaster to make sure it doesn't appear. There is really no excuse for this occurring. The .htaccess provides a means to redirect users to content if you've moved it. Let's say you have a site that talks about a club you have set up. You have a page dedicated to your August 2002 barbecue. You've created a directory called /bbq. The club is successful and another year goes by and you have another barbecue - this time in August 2003. You decide to make the website more manageable and so you create two directories - bbq02 and bbq03 with pages about the festivities. Now, a problem arises. People might have bookmarked the page dedicated to the hilarious food fight at the 2002 shindig: http://www.ourclub.ork/bbq/foodfight.html. Now, of course, you've moved it. I would say that it's your duty as a good webmaster to provide a re-direct. Since /bbq no longer exists, we can create an .htaccess file in our webserver root directory and add the following entry.

# redirects
RedirectPermanent /bbq/foodfight.html http://www.ourclub.ork/bbq02/foodfight.html

You should add any and all web pages that you've moved to /bbq02 to your .htaccess file as well.

Friendly greetings
If you've done your work diligently in providing re-directs for moved pages, then you can be fairly confident that any 404s that are generated in your web logs are probably the result of things beyond your control. Users will often type bad URLs into their browsers and other webmasters may make mistakes providing a link to one of your pages. In these cases, it's probably a good idea to provide an alternative web page to replace Apache's standard 404 warning. Again, .htaccess provides you with this possibility. How elaborate a substitute page you provide depends on you and your imagination (and perhaps your good taste!).

It's a good idea to use grep to look for 404s in your Apache access logs at least once a week or so. You may have re-directed users to other pages but you may have overlooked the fact that people may have bookmarked specific images as well. Apart from the ease-of-use issues, it is also a basic security measure. You may find one IP address generating a lot of 404s. This could be an individual checking out your site as a prelude to a defacing or other attack on your website. You may then want to take steps such as firewalling this IP from your network or, if the situation warrants, contacting the owner of the netblock.
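A minimal sketch of such a weekly check, assuming the log is in the Common Log Format, where the status code is the ninth space-separated field. It uses awk rather than plain grep so that a 404 appearing elsewhere on a line, such as in the byte count, is not counted by mistake. The log lines and addresses below are invented sample data.

```shell
# Hypothetical sample log; a real one usually lives somewhere under /var/log.
access_log="$(mktemp)"
cat > "$access_log" <<'EOF'
192.0.2.10 - - [10/Aug/2003:13:55:36 +0200] "GET /bbq/foodfight.html HTTP/1.0" 404 278
192.0.2.10 - - [10/Aug/2003:13:55:40 +0200] "GET /admin.php HTTP/1.0" 404 270
198.51.100.7 - - [10/Aug/2003:14:02:11 +0200] "GET /index.html HTTP/1.0" 200 5120
192.0.2.10 - - [10/Aug/2003:14:03:02 +0200] "GET /cgi-bin/test.cgi HTTP/1.0" 404 276
198.51.100.7 - - [10/Aug/2003:14:05:45 +0200] "GET /images/logo.gif HTTP/1.0" 404 279
EOF

# Print the client address of every 404, then count them per address,
# with the busiest address first.
report="$(awk '$9 == 404 { print $1 }' "$access_log" | sort | uniq -c | sort -rn)"
printf '%s\n' "$report"

rm -f "$access_log"
```

An address at the top of the list with far more 404s than anyone else is the one worth a closer look. Printing field 7 instead of field 1 lists the missing pages themselves, which helps spot bookmarks you forgot to redirect.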

First, as a website administrator, it's probably a good idea to create a directory for administrative needs. Call it what you like - something meaningful to you. Now you can create an alternative page for your 404s and place it in this directory. The page normally has a simple greeting, maybe something like "Oops! We can't find that.", and maybe a link back to your home page. If you have search capabilities on the site, you may want to link to those. Again, it is up to you as a web administrator to create something that works for you and your site.
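A minimal sketch of how this could be wired up, using Apache's ErrorDocument directive. The directory name admin, the file name 404.html and the temporary site root are assumptions made for this illustration. The sketch also assumes the server allows this directive in .htaccess files (AllowOverride FileInfo or All).

```shell
# Hypothetical stand-in for the webserver root directory.
site_root="$(mktemp -d)"
mkdir -p "$site_root/admin"

# The replacement page: a greeting and a link back home.
cat > "$site_root/admin/404.html" <<'EOF'
<html><body>
<h1>Oops! We can't find that.</h1>
<p><a href="/">Back to the home page</a></p>
</body></html>
EOF

# ErrorDocument maps a status code to a local URL path.
printf 'ErrorDocument 404 /admin/404.html\n' >> "$site_root/.htaccess"
cat "$site_root/.htaccess"
```

The path is given relative to the document root so that Apache serves the page itself and the visitor's browser still receives the 404 status.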

Password protection
Apache also provides a means of keeping people out of certain directories. Again, this depends on some lines placed in .htaccess. Let's go back to your club's website. You may want to create a members-only section to the website that's restricted to those to whom you've given a password. To do this, you would first create the directory and then create an .htaccess file in the directory. Then add the following lines:

AuthUserFile /home/club/.htpasswd
AuthGroupFile /dev/null
AuthName "Our Club - Members Only"
AuthType Basic

Require valid-user


Now you must create the file with the users and passwords in it, called .htpasswd. You will notice that we have placed it outside of the web directories as a security precaution. Apache can read it just fine there and there is no risk of it being read by a nasty spider. Here's how you create the .htpasswd file:
htpasswd -c /home/club/.htpasswd joe

Where joe is the first user in the file. That's important because the -c option creates the file. From now on, for every user you want to add, you don't use the -c option (using it again would overwrite the existing file). The htpasswd program will ask you for the password twice, as is standard in Unix-type applications. Now, when you go to http://www.ourclub.ork/members/secret.html, your web browser will ask for a user name and password before showing the page.

Scripts in alternative locations
Another feature we can get via .htaccess is the ability to use scripts outside our cgi-bin directory. This is another good way to increase the manageability of your website. Let's say you have a section of your website for news about your club. You have it in a directory appropriately called /news. You may have a small Perl script that takes news items out of a MySQL database. You could create a directory in /news called /script and then create an .htaccess file with the following lines in it:
Options +ExecCGI
AddHandler cgi-script .cgi

Now, any script with the .cgi (dot-cgi) extension can be executed as a script. Normally Apache wouldn't allow that but these two lines will override that behavior. Of course, there is a good reason for this not being provided by default. It is a potential security risk. Most websites place their cgi-bin directory outside of the web directory - and for good reason. Any script can be executed from it. It's much more difficult for someone to get at the cgi-bin directory if it's in some other place. But if we place it inside a website's content directories, the possibility of someone manipulating it increases. If you do choose to use this feature, make sure that the scripts are well-written and free from exploitable bugs, such as cross-site scripting vulnerabilities, and that few people - the fewer the better - have upload privileges.
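To try those two lines out, a tiny test script is enough. This is a minimal sketch written in shell rather than Perl, and the file name hello.cgi and the scratch directory are hypothetical. The only things a CGI script really has to do are print a Content-Type header followed by a blank line before its output, and be executable.

```shell
#!/bin/sh
# A throwaway directory standing in for /news/script.
dir=$(mktemp -d)

cat > "$dir/hello.cgi" <<'EOF'
#!/bin/sh
# A CGI response is a header, a blank line, then the body.
echo "Content-Type: text/plain"
echo
echo "Hello from the news script directory"
EOF

# Apache will refuse to run the script unless it is executable.
chmod 755 "$dir/hello.cgi"

# Run it by hand first to see exactly what Apache would send back.
sh "$dir/hello.cgi"
```

If the output by hand looks right but the browser shows a server error, the Apache error log is usually the quickest place to find out why.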

robots.txt
Search engines like Google exist because they are able to make inventories of websites. Yahoo started out with a few individuals creating a directory of the limited number of pages that existed in the early 1990s. At the time of this writing, there are literally billions of pages on the WWW, so it would be too costly to have humans do this manually. What Google and other search engines employ are automated robots. But you as a website maintainer may not want parts of your site to be inventoried by search engines - or you may not even want your site inventoried at all. To make sure that your wishes are respected, popular search engines will have their robots read a file called 'robots.txt' that is placed in the root directory of every website. robots.txt contains instructions for web crawlers, spiders and robots as to which directories are off limits. A robots.txt file that does not allow any prying robot eyes will look like this:
User-agent: *
Disallow: /

The asterisk means any user agent. And the slash / means the root directory and anything in it, which includes subdirectories. In other words, the whole site is off limits to any robot. This is a bit strict. This would definitely not do for a website maintainer who was looking to increase search engine ranking. You probably want to be a bit more lenient:
User-agent: *
Disallow: /admin
Disallow: /reports

This would allow robots to make an inventory of your site except for the two directories /admin and /reports, which you have chosen to restrict their access to.

You can also specify the type of robots you want kept off the site by naming them specifically after User-agent: . You can even have several sections to your robots.txt file for different circumstances.
User-agent: webcrawler
Disallow: /managers
Disallow: /docs

User-agent: lycos
Disallow: /managers
Disallow: /docs
Disallow: /how-to

User-agent: evilrobot
Disallow: /

User-agent: *
Disallow: /managers

What you exclude is up to you (or your organization's policy making body).
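If you want to double-check how a well-behaved robot would read your file, the rules can be approximated in a few lines of shell and awk. This is only a sketch of the basic idea (a robot obeys the section that names it, or the * section if none does, and a Disallow line blocks every path that starts with it). The function name robot_check is made up, the sample file is a cut-down version of the one above, and the sketch assumes one User-agent line per section.

```shell
#!/bin/sh
# A cut-down robots.txt to test against.
robots=$(mktemp)
cat > "$robots" <<'EOF'
User-agent: evilrobot
Disallow: /

User-agent: *
Disallow: /managers
EOF

# Usage: robot_check FILE ROBOT_NAME PATH
# Prints "blocked" or "allowed".
robot_check() {
    awk -v agent="$2" -v path="$3" '
        BEGIN { FS = ":[ \t]*" }
        # Remember which section we are in.
        tolower($1) == "user-agent" { current = tolower($2); seen[current] = 1; next }
        # Collect the Disallow prefixes of each section.
        tolower($1) == "disallow" && $2 != "" { rules[current] = rules[current] $2 "\n" }
        END {
            # Use the section naming this robot, otherwise the "*" section.
            section = (tolower(agent) in seen) ? tolower(agent) : "*"
            n = split(rules[section], prefix, "\n")
            for (i = 1; i <= n; i++)
                if (prefix[i] != "" && index(path, prefix[i]) == 1) { print "blocked"; exit }
            print "allowed"
        }
    ' "$1"
}

robot_check "$robots" evilrobot /index.html
robot_check "$robots" googlebot /managers/list.html
robot_check "$robots" googlebot /index.html
```

Keep in mind that robots.txt is a request, not a lock. A robot that ignores it can still fetch anything your web server is willing to serve.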

System Administration - An Overview


All the hoopla

As mentioned before, some people want you to believe that administering a Linux system is like arranging a peace settlement in the Middle East or understanding a Jackson Pollock painting. Some years ago when Linux was really a hobbyist's system it was considered difficult. Now Linux has gone mainstream and there's nothing taxing about running a Linux system. You do not have to be a computer "guru" to use it. Anybody can be a Linux "administrator".

Years back, if you had the title 'system administrator' it was comparable to the role of the arch-angel Michael. (Michael comes from the Hebrew words meaning He who is like God). That usually meant your own parking space at the company and a seat at the executive dining room. There are some system administrators for large corporations that run mainframes who enjoy these privileges (and rightly so). However, if you've successfully installed the Linux operating system then you too can now proudly wear the badge of 'system administrator'. It doesn't matter if you're setting up one computer running Linux or a bunch of computers in your small business or a computer room in your local community center or school, you've now signed on to become the 'big cheese'.


The role of root

Using the 'root' account is like being God on a Linux system. (Hence, my earlier reference to the archangel Michael). That means that you want to be extremely careful when working as root. With something as simple as a wrong keystroke you could do a great deal of damage. Before you actually sit down and work as root for the first time, I would recommend going into the file known as .bashrc (if you're using the bash shell, which is the most popular one) and adding a few aliases in there. An alias is nothing but an entry in that file that says that a certain command that you type can perform additional actions above and beyond its default behavior. For example, if I type:
rm unwanted.doc
in a terminal, unwanted.doc is going to byte heaven (or hell). rm is for removing or deleting files. There is no undelete that is practical and easy when you're using a shell, so if you didn't want to delete that, you're pretty much out of luck. But if I add an entry in my .bashrc file like this one:
alias rm='rm -i'
it makes sure that I get asked before the file actually gets deleted. You may want to do the same with other potentially dangerous commands.

alias cp='cp -iv'
makes the copy command interactive (i) and verbose (v) so it will both ask you and then tell you what you did.

alias mv='mv -i'
also makes the move command (used for moving and renaming files) interactive.
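There will be times (a bulk clean-up, for instance) when the prompts get in the way. Here is a quick sketch of how to see what an alias expands to and how to bypass it for a single command. The scratch file is only there so the example has something harmless to delete.

```shell
# Define the alias as it would appear in .bashrc.
alias rm='rm -i'

# Show what the alias currently expands to.
alias rm

# A leading backslash skips the alias for this one command,
# so the file is removed without a prompt.
tmpfile=$(mktemp)
\rm "$tmpfile"
```

The backslash only affects that one command. The next plain rm you type will ask for confirmation again.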

There are people who say that adding these to root's .bashrc is something that 'wussies' do. I always ask them: If a sailor tied a cable on to himself/herself before he/she went out on deck in thirty foot seas to fix something, would that be considered a 'wussie' move? Making a mistake is comparable to encountering a rogue wave on a calm sea. There really isn't anything comparable to being in rough seas sitting in front of your computer, but just as dangerous rogue waves have been known to appear on calm, sunny days and sink boats, silly mistakes have ruined projects. Better to keep a buffer zone between you and your mistakes.


Delegating authority

Back to the nautical motif for a moment; just as one ship generally doesn't have two captains, it is rare that a small organization would have two systems administrators. There's usually not too much benefit in delegating authority in this setting. People are prone to making mistakes. Even a seasoned systems administrator has sometimes deleted files that he/she shouldn't have or messed up the configuration of something. If two heads think better than one, then four hands also might make more mistakes than two.


The use of sudo as an alternative

If you're the head systems administrator (or the only one) you can "deputize" your co-workers by installing and configuring the program sudo. In Unix/Linux speak, the term 'su' means superuser - that is, root. Only root has true administration rights and privileges, so this program allows others to "do" su, hence the name, sudo. Ok, Sheriff, time to fight the bad guys. Let's see what your deputies can do.

su can also stand for switch user. For example, if you had two accounts on a machine - let's say bob and harry - you could log on as 'bob' and do: su harry and then work as harry.

Your distribution should have this very popular program among its packages. If it doesn't, you can go to: http://www.courtesan.com/sudo and get Todd Miller's great application. After you've installed it, you have to create what's called a sudoers file. You do this by typing:
visudo
as root. This is essentially a special version of the text editor vi just for creating and editing the sudoers file.

Basic Vi commands
  • ESC+i = insert text
  • ESC+:wq = save and quit
  • ESC+u = undo


Here is an example sudoers file I have for my home network. It is not really as complicated as most are, but it gives a good basic idea of what you need to do to let other users help you out with some administration tasks.
#
# This file MUST be edited with the 'visudo' command as root.
#
# See the sudoers man page for the details on how to write a sudoers file.
#

# Host alias specification

# User alias specification

User_Alias TRUSTED = mike, maria

# Cmnd alias specification

Cmnd_Alias INTERNET = /usr/sbin/traceroute, /usr/sbin/ntpdate
Cmnd_Alias KILL = /bin/kill, /usr/bin/killall
Cmnd_Alias TOOLS = /bin/mount, /bin/umount

# User privilege specification
root    ALL=(ALL) ALL
TRUSTED ALL=INTERNET, KILL, TOOLS

Let's break this down. First of all, we add the line
User_Alias TRUSTED = mike, maria
That means that the users mike and maria become the "trusted" users. And what are they trusted with? Jump down to the last line for a second. They are trusted with commands in the group INTERNET, KILL and TOOLS. What are those commands? Jump back up to the section
# Cmnd alias specification
These trusted users can use ntpdate, for example, to keep the computer's time correct. More information on that command later. (One of your duties as system administrator will be to make sure your machines keep accurate time and display the correct date. ntp is probably the best package to use to do this.)

I've created a KILL group (sounds like Mafia hit men!) so other users can kill runaway processes that can normally only be shut off by root. Some server process may have a problem and you might have to shut down that process. Starting it again is something that's not included here, however. It might be best for these deputized users to call the "real" system administrator and if that's you, for example, you may want to check out the configuration files for certain servers before you start them again. You may have to mount floppies or other partitions to get data from them, and that's where the TOOLS section comes in handy.
When the user executes a command that's authorized in the sudoers file, he/she first needs to type sudo and then the command. For example, if you wanted to update the machine's clock to the exact time, you would type:
sudo ntpdate atimeserver.nearyou.gov/edu

Then you need to type your user password. If you make a mistake, sudo will hurl insults at you (no kidding). Todd Miller has a good sense of humor and the results of botching a keystroke are sometimes hilarious!

You can add more commands and users to your own sudoers file. Whatever you think is prudent in your situation. There is some possibility for abuse. Use your best judgment.
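Before trusting a new set of rules, it can be checked for syntax errors. The sketch below writes a cut-down version of the example to a scratch file and, if visudo is installed, runs it in check-only mode (-c) against that file (-f). The scratch file is purely for illustration; the real sudoers file should still be edited with visudo.

```shell
#!/bin/sh
# A cut-down copy of the example rules in a scratch file.
draft=$(mktemp)
cat > "$draft" <<'EOF'
User_Alias TRUSTED = mike, maria
Cmnd_Alias KILL = /bin/kill, /usr/bin/killall
root    ALL=(ALL) ALL
TRUSTED ALL=KILL
EOF

# Check-only mode reports syntax errors without installing anything.
if command -v visudo >/dev/null 2>&1; then
    visudo -c -f "$draft" || echo "syntax check failed"
else
    echo "visudo not found; skipping syntax check"
fi
```

Once the rules are in place, a deputized user can type sudo -l to see which commands he/she is allowed to run.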


Taking care when working as root

As I mentioned, there's a chance of doing some damage when you work as root. There are other ways to protect yourself besides putting aliases in your .bashrc file. One of them is using the program su.

su lets you work as root when you're logged in as another user. Good practice dictates that you disallow root logins from remote machines, so if you're performing administration tasks remotely, su is irreplaceable. The remote machine lets you log in as user fred, for example, and then you just type:
su
and type the root password. For all intents and purposes you've got a root terminal open now. That means that you can do anything - just as if you had logged in as root in the first place. You're really running the same risks by working as root, but you've at least eliminated the risk of logging in as root. That's quite important from a security standpoint. The real advantage to using su is the possibility to carry out individual commands. Let's say you've downloaded an application and it's only available as source code in a tarball. You can download the source code to your user directory and as a user you can run the configure and make scripts provided with most tarballs. That way, you minimize the risks of compiling the software. If you're satisfied with the binary file (the application itself), then you can use 'su' to install the application and its accompanying documentation in the usual application and documentation directories. Here's what you'd do:
su -c "make install"
You need to add the -c (command) switch and put the actual command in quotes. Then type the root password.

As you see, you can run any command that root can. That, of course, means that you need to know the root password, so this is something that only the system administrator should do. It's not a good idea to be giving your password to other users so they can run commands. They can do that with sudo, as we mentioned earlier.
You can also use su to do things as other users. You're not just restricted to doing things as root. This might come in handy if you've got two accounts on a machine. Instead of having to log in as the other user, you could, for example, read a file that you've restricted permissions for. Let's say you've got a file called my_ideas and you've removed read permissions to it for your group and for others. Instead of logging in, you can type:
su fdavis -c "less /home/fdavis/my_ideas"
and you will now be asked for fdavis' password. Now you can access files from your other user account without having to log in. Other users may also give you their passwords to access things in their account. I question the wisdom of doing that, but in an emergency it's an option. A co-worker who's ill may give you his/her password to access important files. That's acceptable in that kind of situation it seems, but the ill worker should promptly change his/her password as soon as he/she's back at work.

As you can see, if used correctly, su is a great tool for getting administration tasks done safely. It's also useful as a time and trouble saver. Just make sure you don't abuse it.

Mail Servers

The two most widely used protocols on the Internet are http, hypertext transfer protocol (i.e. the WWW) and smtp, simple mail transfer protocol (i.e. email). We've just dealt with serving web content with Apache. Now we'll deal with managing an email system.

The bulk of the behind-the-scenes email tasks are carried out by the MTA or mail transfer agent. This is the primary software working on a machine set up to be an email server. Your email client software (Evolution, Kmail, Mozilla's mail client, etc.) will send your email message to the MTA which will then send it out into the Internet and to its intended recipient. So, effectively, the MTA transfers your email message to another MTA, the one that handles the account of its recipient, which then stores it in a file known as a mailbox or a mail spool. Your recipient's client software will then request messages from his/her server and the mail spool's contents will be transferred to his/her client's mailbox. That is essentially how email works in a nutshell. There can be other programs mixed in there as well. I personally use a program called fetchmail, a mail retrieval agent which just "picks up" my email and passes it through a program called procmail, which filters it according to some rules I have created. Procmail places it in mailboxes and then I read it with a mail client called mutt. Then I compose mail with emacs and send it to mutt, which then sends it to the MTA. My system is really not the norm. The typical user will go through a simpler route - perhaps something like this:

Evolution --> MTA (sender) --> MTA (recipient) <-- Evolution.

As you can see, the objective is to keep this simple, so the burden is on you as the system administrator to make sure that it is. That, however, is not going to be simple for you. At the time of this writing, email is a battlefield filled with land-mines. Spam is the principal problem, but there are others. It's your job to make sure that the people in your organization can safely assume that their email is going to arrive at its destination. Sometimes that's easier said than done.

Postfix MTA
Postfix is a mail transfer agent created by Wietse Venema for IBM. Its principal virtues are that it's easy to administer and it's pretty secure. I have also found that it plays very, very nicely with other complementary programs and you can use it to set up pretty elaborate email systems. I have used it to set up interesting email schemes for a few companies so they can offer email accounts to a large number of users without the need to have user accounts on the machine. You can set up Postfix to work with MySQL to keep track of the users and with Courier-IMAP to provide authentication for mail pickup. Though the configuration of a system like this is not difficult, it is not trivial either, so we won't be dealing with it here. Information on how to set up a system like this is freely available via how-tos on the Internet, if you're interested.

Why not sendmail?
sendmail is the oldest and most widely-used mail transfer agent. However, in the author's opinion it suffers from two flaws. One is that it is difficult to administer. The cryptic sendmail configuration file is just not a good place for the budding system administrator to be hanging out. Secondly, exploitable bugs are frequently found in it. That's why I've chosen Postfix as my mail transfer agent. Postfix was developed around 1998, so Wietse Venema already had a pretty good idea about what nasty stuff could be done to a mail server. This is not to disparage the efforts of sendmail's developers. Being the most widely used MTA says a lot about it, or, as Spanish speakers say: quien tuvo retuvo. In the end, I think our teaching/learning interests are best served with Postfix.

Installation of Postfix
Most major Linux distributions offer the possibility to install Postfix, though in my experience, it is not the one installed by default. Debian prefers to install software that's licensed under the GPL. Postfix is not (it carries the IBM Public License), so you need to tell Debian's dselect installer that you want it instead of Exim, which is Debian's preferred MTA. RedHat will install Sendmail by default. Again, you need to adjust this either at install time or by removing Sendmail and substituting it with Postfix.

Postfix configuration files
Postfix places its configuration files in /etc/postfix. The main configuration file is appropriately named main.cf. You will be dealing primarily with this file to make changes in your Postfix configuration. What's nice about Postfix is that this configuration file is not particularly difficult to comprehend. The options are pretty straightforward. Here is a sample main.cf:
# Do not change these directory settings - they are critical to Postfix
# operation.
command_directory = /usr/sbin
daemon_directory = /usr/lib/postfix
program_directory = /usr/lib/postfix

smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)
setgid_group = postdrop
biff = no

# appending .domain is the MUA's job.
append_dot_mydomain = no
mydomain = domain.ork
myhostname = mail.domain.ork
alias_maps = hash:/etc/aliases
myorigin = /etc/mailname
mydestination = $myhostname, $mydomain, localhost
relayhost =
mynetworks = 192.168.0.0/16, 127.0.0.0/8
#mailbox_command = procmail -a "$EXTENSION"
mailbox_size_limit = 0
recipient_delimiter = +

As you can see, the first part tells you not to change the directory settings for where the program is. That's a good idea! Next you'll see a setting known as smtpd_banner. This is a mailserver's way of identifying itself to the outside world. Ours looks like this:
smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)

Here you'll see that we tell them who we are ($myhostname). This is mandatory according to the SMTP specification. $mail_name is the name of our MTA (Postfix). Though Postfix is not configured to do this by default, we could add $mail_version between $mail_name and (Debian/GNU) to announce our Postfix version to the world.

Postfix has its own user group, named postdrop, which you can see in the following line. After this, we have a line that we can change if you want postfix to pass notification to a 'biff' program that notifies when there is new mail. Let's go to the next section.

There's a lot of brain-dead mailing software out there. Sometimes it doesn't bother to check if you've put a proper From: address in your mail. There is also a lot of software that purposely doesn't do it. It's mostly used by people who don't want you to know who the mail's from. Yes, you guessed it - spammers! Hence, the comment at the beginning of this line. If you want to add your own domain to mail of this kind coming and going, just change the 'no' to 'yes'.
# appending .domain is the MUA's job.
append_dot_mydomain = no

The next lines are pretty self-explanatory:
mydomain = domain.ork
myhostname = mail.domain.ork

The next line:
myorigin = /etc/mailname
is the contents of the file it mentions. This is normally the same as your hostname minus the host itself, or in our case, domain.ork.

The next line lets us handle mail for more domains. If you have to do that you can add them here:
mydestination = $myhostname, $mydomain, trinkets.bis, localhost

The next line:
relayhost =
gives you the possibility of using the mail server on a machine for the sole purpose of sending mail to another mail server. The other server is the one that really sends it out into the Internet. This might be used in large organizations where there may be divisions of the company that work in remote locations with different types of Internet connection. There are cases where workers are accessing Internet from some type of broadband connection. If the mail were to go out directly, new anti-spamming techniques might tag this mail as spam (and often do!). This way, a branch office of an organization might configure Postfix to have a relay host which is at the main office with a permanent high-speed Internet connection. So you could use a hostname or IP address here as well.
relayhost = mail.parentco.con

The 'mynetworks' setting is one of the most important. This is a list of IPs or hostnames that are allowed to use your mail server. In this day and age, when spammers could put out signs inspired by McDonald's bragging about 'billions served', you've got to be extremely careful about this. If you're not an ISP, the setting below should be the only one you use, namely, your local network and the machine itself.

You may also have a local network of 10.0.0.0/8 or 172.16.0.0/12. I just use the 192.168.0.0/16 out of habit.
mynetworks = 192.168.0.0/16, 127.0.0.0/8

In certain, very special circumstances you could add an IP or a host name here, but this would have to be a fixed IP or hostname of a trusted person.

The potential for abuse of mail servers is enormous. I spoke to a system administrator who performed an experiment with a modified version of formmail.pl, a script that once contained a flaw which was used a lot by spammers. The script he prepared appeared to be the flawed script but it never actually sent any mail - it just logged the IPs of spammers. Within 20 minutes of putting the script on a server, the first spammer tried it. By the end of the day, one spammer had tried to send over 200,000 messages through it.

You can uncomment this line if you want Postfix to pass received mail automatically through procmail filters before it is delivered.
#mailbox_command = procmail -a "$EXTENSION"

We can also put a size limit on our users' mailboxes. If you want to restrict your users to mailboxes of, say, 200 MB, where disk space isn't a problem, you would use:
mailbox_size_limit = 200000000

(Sizes are in bytes). You could also add a line to restrict the size of mails that are allowed to be sent. This would restrict them to a 3MB limit.
message_size_limit = 3000000
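Because both limits are given in bytes, it's easy to drop or add a zero. One way to avoid that is to let the shell do the arithmetic. This is just a convenience sketch that uses decimal megabytes (1 MB = 1,000,000 bytes), as the figures above do; the variable names are made up.

```shell
# Size limits in megabytes, as discussed above.
mailbox_mb=200
message_mb=3

# Convert to bytes, the unit main.cf expects.
mailbox_bytes=$((mailbox_mb * 1000 * 1000))
message_bytes=$((message_mb * 1000 * 1000))

# Print the lines ready to paste into main.cf.
echo "mailbox_size_limit = $mailbox_bytes"
echo "message_size_limit = $message_bytes"
```

One thing to watch: unless mailbox_size_limit is 0 (no limit), Postfix expects it to be at least as large as message_size_limit.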

The last line in our example:
recipient_delimiter = +
is there to resolve some issues with mailing list software and is best left alone unless you run into problems with something that doesn't like it.
That is a bare-bones example that will do all right on a simple mail server. You may want to tweak it a bit so that instead of just an "all right" job - it does a damn fine one!

Let's look into some additional options.

Anti-virus and anti-spam measures
Viruses and spam are killing the "killer app" of the Internet, which is email. Individuals, businesses and organizations rely on it for communication but that reliability is fading fast. As a system administrator, you can do a lot to see that a good deal of spam and most viruses never reach any user. Here we are going to see a few examples of methods to block spam and viruses at the server level.

Blocking viruses at the server
I can't find a compelling reason why people should be sending email with potentially dangerous attachments and I personally think it's a good policy not to allow these mails to enter your server. After all, your server is either your property or it is the property of your employer and you're within your rights to restrict access to it. Postfix will perform a check on the body of the email message and it can either reject or quarantine mails with certain files attached to them. First, you need to have the Perl compatible regular expression (PCRE) package installed that works with Postfix. Postfix will use this package to parse the email messages for these nasty gifts that are sometimes included in them.

First, we need to create a file with the expressions we are to look for. The file is normally called body_checks, so we know what we're doing with it. It will include lines like this:
/^(.*)name="(.*)\.vbs"$/   REJECT Mail contains banned attachment
/^(.*)name="(.*)\.pif"$/   REJECT Mail contains banned attachment
/^(.*)name="(.*)\.scr"$/   REJECT Mail contains banned attachment
/^(.*)name="(.*)\.exe"$/   REJECT Mail contains banned attachment
/^(.*)name="(.*)\.com"$/   REJECT Mail contains banned attachment
/^(.*)name="(.*)\.lnk"$/   REJECT Mail contains banned attachment
/^(.*)name="(.*)\.dot"$/   REJECT Mail contains banned attachment

These are particularly notorious attachments. You may be able to think up some of your own. What we have done here is reject these and send them back to where they came from with our reason for rejecting them. There's no need to get particularly verbose here. We've made our point. You may also skip the message and just silently drop the mail by using DISCARD in place of REJECT.
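It's worth trying a pattern against a sample line before trusting it with real mail. The sketch below uses grep -E as a stand-in, since this particular expression means the same thing in grep's extended syntax. The sample lines and the function name would_reject are invented for the test. Note that the dot before the extension is escaped so that it only matches a literal dot.

```shell
#!/bin/sh
# The attachment pattern, with the dot escaped so it matches a literal ".".
pattern='^(.*)name="(.*)\.vbs"$'

# Report whether a line of message body would trigger the rule.
would_reject() {
    if printf '%s\n' "$1" | grep -Eq "$pattern"; then
        echo "REJECT"
    else
        echo "pass"
    fi
}

would_reject 'Content-Type: application/octet-stream; name="invoice.vbs"'
would_reject 'Content-Type: text/plain; name="notes.txt"'
would_reject 'Content-Type: text/plain; name="myvbs"'
```

On a machine with Postfix installed, postmap -q "some test line" pcre:/etc/postfix/body_checks queries the real table the same way and prints the action that would be taken, if any.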

If your company has a particular need to be receiving certain attachments that might be considered potential hazards, like *.zip files which after being unzipped can cause harm, you can also quarantine mails. Here's a sample rule:
/^(.*)name="(.*)\.zip"$/   HOLD

The HOLD option will place mails in a security queue and you can deal with them later. You can then look at messages in the hold queue with postcat. Mail can then be deleted or passed on to its intended recipient with postsuper.

You can save your body_checks file and place it in /etc/postfix. Now you need to open the main.cf file and add the following line:
body_checks = pcre:/etc/postfix/body_checks

You can keep track of developments and add more rules as you see fit.


Anti-spam measures with Postfix
As mentioned before, the flood of Unsolicited Commercial Email, or "spam" is slowly killing the communication value of email. Until a replacement is found, all that we can do is use sandbags against the torrent. Luckily, it's fairly easy to implement anti-spam measures with Postfix.

In the past, the use of poorly configured mail servers was the main way of moving spam around the Internet. That is becoming less and less of an option for spammers. Administrators are getting smarter and those open relays that are still left are quickly blacklisted. Spammers have essentially moved on to other forms, mostly illegal, to peddle their Viagra and male member enlargement schemes. It's still a good idea, though, to have Postfix query a spam blacklist server before it picks up the mail.

Spam blacklists are a bit like religious denominations. There are fairly tolerant ones and then there are fire and brimstone fundamentalists. At times, legitimate domains have ended up on blacklists because of misconfiguration or even false reports. I have seen cases where people have forged IP addresses in mail headers to get a particular domain on a spam blacklist. Your tolerant blacklist maintainer will listen to reason. Then there are those who are run by the modern-day equivalent of Jonathan Edwards, who wrote Sinners in the Hands of an Angry God. A well-run open relay database will overlook mistakes and forgeries and get you quickly off their list. You should probably do some looking around on Usenet via Google to see if people are talking well or ill of a particular spam blacklist. I would avoid any that has the term 'Nazi' associated with it.

The Open Relay Database - ordb.org is a well-known, free list of open relays. To use its services, add the following line to your main.cf
smtpd_client_restrictions = reject_rbl_client relays.ordb.org

Other anti-spam measures
There are other ways besides open relay databases to try to keep spam from getting into your mail queues. Referring back to the introduction to this section, the new trend in spam is to plant trojans on unsuspecting broadband users' Windows machines. These trojans have an SMTP engine incorporated into them, so the cracked Windows box becomes a mail server under the control of the spammer. Postfix is designed to try to differentiate between a normal user trying to send legitimate mail and a spammer doing anything he can to move his junk around the net. In order to separate the spam from the sirloin, you can add these after your smtpd_client_restrictions = line:
smtpd_sender_restrictions = reject_unknown_sender_domain,
    reject_non_fqdn_hostname, reject_non_fqdn_sender,
    reject_invalid_hostname

Let's have a look at what these do. First, reject_unknown_sender_domain will reject any mail whose sender address uses a domain that has neither an A nor an MX record in DNS. This means that the domain this mail claims to come from is not set up to handle mail at all. reject_non_fqdn_hostname means that we will reject any server trying to connect to yours that doesn't identify itself with a Fully Qualified Domain Name. That is, some lousy spam spewer connects to your server and says: 'I am X.X.X.X' instead of 'I am mail.domain.com' and you tell it to take a hike. The next one, reject_non_fqdn_sender, refers to the sender's address having an FQDN in the From: address. A lot of spam just flows out of horrendously made software that doesn't seem to be 'From' anybody. The last one, reject_invalid_hostname, rejects servers that introduce themselves with a malformed hostname.

You can also add some anti-spam rules to your body_checks and add another line to our main.cf to do some Perl checks on the headers as well. First, let's add a line to main.cf
header_checks = pcre:/etc/postfix/header_checks

Now we can create the corresponding file header_checks and add some rules to it.
/^Subject: FREE OFFER!!/ REJECT
/^From: spamking@lousyrottenspammer.com/  REJECT
/^X-Mailer:.*Some Spamware Tool/ REJECT
These are probably the best headers to use in your anti-spam rules.

If you keep adding to your header_checks and body_checks you will be able to get a handle on a lot of undesired mail. This is, of course, not 100% effective - but then again, nothing ever is.

Other Postfix administrative tasks


Using mail aliases
It's very common for organizations to have aliases. Let's say you're doing system administration work at an accounting firm. Your firm announces that all queries having to do with tax filing should be sent to taxes@creativeacct.com. Now, you have a bright young intern named Bob Ledger who handles all these questions and he already has an account: bledger@creativeacct.com. There's no reason for Bob the intern to have to be monitoring two accounts. We can just create taxes@creativeacct.com as an alias for his personal account like so:

There is a file in /etc called aliases. This is where we can define this alias for Bob. We add a line to the file that looks like this:
taxes: bledger


Now, we invoke the postalias program to add this to our alias database
postalias /etc/aliases

Since we can assume that Postfix is configured to handle the mail for the creativeacct.com domain, anything that comes for taxes@creativeacct.com gets delivered to Bob's mail spool. Then it would be up to Bob to configure his personal email program to filter these messages and deal with them as he sees fit.
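
To make the mapping concrete, here is a small sketch of what an alias lookup amounts to. It does not touch the real /etc/aliases or the Postfix database; it writes the one-line entry from above into a temporary file and looks the name up with awk. The function name lookup_alias and the temporary file are inventions for this example only.

```shell
#!/bin/sh
# Build a throwaway aliases file containing the entry from the text.
aliases_file=$(mktemp)
printf '%s\n' '# tax questions go to Bob' 'taxes: bledger' > "$aliases_file"

# Print the target of an alias, or nothing if there is no such entry.
# Fields are split on the colon plus any spaces that follow it.
lookup_alias() {
    awk -F': *' -v name="$1" '$1 == name { print $2 }' "$aliases_file"
}

target=$(lookup_alias taxes)
echo "taxes -> $target"

rm -f "$aliases_file"
```

With Postfix installed, postalias -q taxes hash:/etc/aliases asks the real alias database the same question, assuming your alias database is of the hash type.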

Removing mail from the queue
At times, you may have to remove mail from the send queue. This normally happens when you send mail to some host that may have gone down for some period of time. Your mail logs will get cluttered up with periodic tries to re-send. If you see this and you really think the chances are slim that the mail is going to reach its destination, you can remove it from the queue. Each message has a unique queue-id. Here is an example:
Jan 15 13:01:30 mailserver postfix/smtp[31604]: 182D312D5F:
to=, relay=none, delay=375562, status=deferred
(Name service error for hotnail.com: Host not found, try again)

So, if you see messages that aren't going anywhere, then you can remove them from the queue using this ID number (182D312D5F in the log entry above). First, shut down Postfix, to be safe. Then do the following as root, putting the actual ID in place of [queue-id]:
find /var/spool/postfix -name [queue-id] -print | xargs rm
find will (pardon the redundancy) find the queue file with that ID number and pass it along to rm to be removed. Then start Postfix again.
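
If you have more than a couple of these to clean up, picking the ID out of the log by eye gets old fast. The sketch below pulls the queue ID out of a log line shaped like the sample above using sed. It assumes the traditional short hexadecimal queue IDs and a log line that sits on a single line, as it does in a real log file. The function name queue_id_from_log is made up for this example.

```shell
#!/bin/sh
# Read log lines on stdin and print the queue ID from each Postfix entry.
# The ID is the hex token that follows the "postfix/program[pid]: " tag.
queue_id_from_log() {
    sed -n 's|^.*postfix/[a-z]*\[[0-9]*]: \([0-9A-F]\{6,\}\):.*$|\1|p'
}

# The sample entry from the text, joined onto one line.
line='Jan 15 13:01:30 mailserver postfix/smtp[31604]: 182D312D5F: to=, relay=none, delay=375562, status=deferred'

id=$(printf '%s\n' "$line" | queue_id_from_log)
echo "$id"

# With Postfix installed, the ID can also be handed to Postfix's own
# queue tool (as root) instead of using find and rm:
#   postsuper -d "$id"
```

postsuper -d is the tool Postfix ships for deleting a queued message by ID, and it saves you from poking around in /var/spool/postfix by hand.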

Study, study and more study

We have really only scratched the surface of what running an email server entails. It's now up to you to take this basic, general knowledge and improve your skills at handling such an important task as email management. There are thousands of pages dedicated to running Postfix more efficiently. I'd advise you to take a look at Postfix's documentation very closely before you attempt to set up a mail server of your own. Also, look at sample configurations that people have posted on their websites. You can get some really excellent ideas from them - ideas that will save you a lot of time and trouble.