Thursday, May 9, 2013

One Of The Most Important Tools In Linux – Understanding Chmod

http://www.makeuseof.com/tag/one-of-the-most-important-tools-in-linux-understanding-chmod


There are plenty of features that make Linux special, but one that makes it so secure is its permissions system. You have fine-grained control over all the files in your system and can assign permissions to users, groups, and everyone else. The terminal utility “chmod” controls all the permissions on your system, so it’s vital to know how chmod works in order to get the most out of this feature, especially if you’re planning on building your own Linux server.
There’s plenty you’ll need to know in order to understand the mechanics of the permissions system and control it as you please, so get ready to take some notes. For starters, it’s best to take a look at 40 terminal commands that you should be familiar with before diving in.

Components Of Permissions

The Linux permissions system is configured in such a way that you can assign file and directory permissions to three different categories – the user, the group, and everyone else. Each file or directory is owned by a user and group, and these fields cannot be empty. If only the user should own the file, then the group name is often the same as the username of the owner.
You can assign specific permissions to the owner, different permissions to the group, and even other permissions to every other user. The different permissions which you can assign to any of these three categories are:

  • read – 4 – ‘r’
  • write – 2 – ‘w’
  • execute – 1 – ‘x’
The numbers 4, 2, and 1, as well as the letters r, w, and x, are different ways in which you can assign permissions to a category. I’ll get to why these numbers and letters are important later on.
Permissions are important because, as you might assume, they allow certain people to do certain things with a file. Read permissions allow the person or group to read the contents of the file, and copy it if they wish. Write permissions allow the person or group to write new information into the file, or overwrite it completely. In some cases this also controls who is allowed to delete the file; otherwise the sticky bit comes into play, which won’t be covered here. Finally, execute permissions allow the person or group to run the file as an executable, whether it’s a binary, a .sh script, or anything else.
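A quick way to see the execute bit in action is with a throwaway script (the file name here is my own, not from the article):

```shell
# Create a tiny script, then try running it with and without the execute bit.
printf '#!/bin/sh\necho hello\n' > hello.sh
chmod 644 hello.sh    # rw-r--r-- : no execute bit anywhere
./hello.sh            # fails with "Permission denied"
chmod 755 hello.sh    # rwxr-xr-x : owner (and everyone) may now run it
./hello.sh            # prints "hello"
```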

Understanding Assigned Permissions

Open a terminal and go to any folder on your system – say your Home folder. Type in the command ls -l and hit Enter. This command lists all of the files and directories found in whatever folder you’re currently in.
Each line represents a file or directory, and it begins with something that might look like -rw-rw-r--. This shows you the permissions of the file or directory. In this case, the first dash shows that you’re looking at a file. If it were a directory, there would be a “d” in this spot. The next three spots, rw-, show that the user who owns the file has read and write permissions (rw), but no execute permission, as there’s a dash instead of an “x”. The same is repeated for the next three spots, which represent the permissions of the group that owns the file.
Finally, the last three spots are r--, which means that everybody else can only read the file. As a reference, the full set of possible permissions is drwxrwxrwx. It’s also important to note the “dmaxel dmaxel” that you see after the permissions. This shows that the owning user of the file is dmaxel and the owning group is dmaxel. For files that really are only supposed to belong to one user this is the default behavior, but if you’re sharing with a group that has multiple members, you’ll see that group’s name here instead.
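If you’d rather see the same information numerically, stat can print it (the -c format flag below is GNU coreutils; BSD/macOS stat uses a different syntax, and the file name is just an example):

```shell
# Show symbolic mode, octal mode, owning user and owning group.
touch somefile && chmod 664 somefile
stat -c '%A %a %U %G' somefile
# e.g. -rw-rw-r-- 664 dmaxel dmaxel
```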

Assigning New Permissions

Remember the numbers and letters I mentioned earlier? Here’s where you’ll need them. Let’s say you have a file called “important_stuff” that’s located at the path /shared/Team1/important_stuff. As the team leader, you’ll want to be able to read and write to the file, your group members should only be allowed to read the file, and everyone else shouldn’t have any permissions at all.
In order to make sure that you and your group own the file, you’ll need to run the command chown. An appropriate command for this situation would be chown me:Team1 /shared/Team1/important_stuff. That command runs chown, and tells it that the file at path /shared/Team1/important_stuff should belong to the user “me” and the group “Team1”.
It’s assumed that the desired group has already been created and that its members have Team1 as a secondary group in the system (also not covered here). Now that you have set the owner and group, you can set the permissions. Here, you can use the command chmod 640 /shared/Team1/important_stuff. This runs chmod, and assigns the permissions 640 to the file at path /shared/Team1/important_stuff.
Where did 640 come from? Add up the numbers for the permissions you want – for read and write permissions, you have 4 + 2 = 6, and that 6 represents the permissions for the user. The 4 comes from just the read permission for the group, and the 0 comes from no permissions for everyone else. Therefore, you have 640. The number system is convenient because there’s a distinct digit for every possible combination: none (0), x (1), w (2), wx (3), r (4), rx (5), rw (6), and rwx (7).
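You can rehearse the whole sequence on a scratch file (chown to a different user and group generally needs root, so only the mode is changed here):

```shell
touch important_stuff
chmod 640 important_stuff   # 6 = rw for the user, 4 = r for the group, 0 for others
ls -l important_stuff       # the line starts with -rw-r-----
```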
As an example, full permissions for everyone would be 777. If you have security in mind, however, it’s best to assign only the permissions that you absolutely need – 777 should be used rarely, if at all.

Alternative Method

While I prefer the number method of assigning permissions, you can increase your flexibility and also add or remove permissions using the representative letters. For the above situation, the command used could also be chmod u=rw,g=r,o= /shared/Team1/important_stuff. Here, u=rw assigns read and write permissions to the user, g=r assigns read permissions to the group, and o= assigns no permissions to everyone else. There’s also ‘a’ which can assign the same permissions for all categories.
You can also combine these in different ways for varying permissions, and use + or - signs instead of =, which simply add or remove the listed permissions rather than completely overwriting the permissions of the categories you’re changing.
So, different examples can include:
  • chmod a+x /shared/Team1/important_stuff assigns execute permissions to everyone if they don’t have it already
  • chmod ug=rw,o-w /shared/Team1/important_stuff forces the user and group to have exactly read and write permissions, and takes away the write permission from everyone else in case they had it. (Note the comma: with a space, chmod would treat o-w as a file name.)
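To convince yourself that the symbolic and numeric forms agree, you can run both against a scratch file (stat -c is GNU coreutils):

```shell
touch important_stuff
chmod u=rw,g=r,o= important_stuff   # same result as chmod 640
stat -c '%a' important_stuff        # 640
chmod a+x important_stuff           # add execute for every category
stat -c '%a' important_stuff        # 751 (6+1, 4+1, 0+1)
```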

Applying Permissions To Multiple Files

Additionally, you can add the -R flag to the command in order to recursively apply the same permissions to all files and directories within a directory. If you wanted to change the permissions of the Team1 folder and everything within it, you could run the command chmod -R 640 /shared/Team1.
Applying the same permissions to multiple individually chosen files can be done with a command such as chmod 640 /shared/Team1/important_stuff /shared/Team1/presentation.odp.
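One caveat worth adding here: directories need the execute bit in order to be entered, so a blanket chmod -R 640 would lock everyone, including you, out of the subfolders. A common pattern (my suggestion, not from the article) is to set files and directories separately with find, shown here on a scratch tree:

```shell
mkdir -p Team1/reports && touch Team1/reports/important_stuff
find Team1 -type d -exec chmod 750 {} +   # directories: rwxr-x---
find Team1 -type f -exec chmod 640 {} +   # files:       rw-r-----
```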

Conclusion

Hopefully, these tips have helped you improve your knowledge of the permissions system found in Linux. Security is an important matter to consider, especially on mission-critical machines, and using chmod is one of the best ways to keep security tight. While this is a fairly in-depth look at using chmod, there’s still a bit more that you can do with it, and there are plenty of other utilities that complement chmod. If you need a place to start, I would suggest doing more research on all of the things you can do with chown.
If you’re just getting started with Linux, have a look at our Getting Started Guide to Linux.
Are file permissions important for you? What permissions tips do you have for others? Let us know in the comments!

Wednesday, May 8, 2013

4 Hot Open Source Big Data Projects

http://www.enterpriseappstoday.com/data-management/4-hot-open-source-big-data-projects.html


There's more -- much more -- to the Big Data software ecosystem than Hadoop. Here are four open source projects that will help you get big benefits from Big Data.
It's difficult to talk about Big Data processing without mentioning Apache Hadoop, the open source Big Data software platform. But Hadoop is only part of the Big Data software ecosystem. There are many other open source software projects that are emerging to help you get more from Big Data.
Here are a few interesting ones that are worth keeping an eye on.

Spark

Spark bills itself as providing "lightning-fast cluster computing" that makes data analytics fast to run and fast to write. It's being developed at UC Berkeley AMPLab and is free to download and use under the open source BSD license.
So what does it do? Essentially it's an extremely fast cluster computing system that can run data in memory.  It was designed for two applications where keeping data in memory is an advantage: running iterative machine learning algorithms, and interactive data mining.
It's claimed that Spark can run up to 100 times faster than Hadoop MapReduce in these environments. Spark can access any data source that Hadoop can access, so you can run it on any existing data sets that you have already set up for a Hadoop environment.
Download Spark

Drill 

Apache Drill is "a distributed system for interactive analysis of large-scale datasets."
MapReduce is often used to perform batch analysis on Big Data in Hadoop, but what if batch processing isn't suited to the task at hand: What if you want fast results to ad-hoc queries so you can carry out interactive data analysis and exploration?
Google developed its own solution to this problem for internal use with Dremel, and you can access Dremel  as a service using Google's BigQuery.
However if you don't want to use Google's Dremel on a software-as-a-service basis, Apache is backing Drill as an Incubation project. It's based on Dremel, and its design goal is to scale to 10,000 servers or more and to be able to process petabytes of data and trillions of records in seconds.
Download Drill source code

D3.js

D3 stands for Data-Driven Documents, and D3.js is an open source JavaScript library that allows you to manipulate documents that display Big Data. It was developed by New York Times graphics editor Michael Bostock.
Using D3.js you can create dynamic graphics using Web standards like HTML5, SVG and CSS. For example, you can generate a plain old HTML table from an array of numbers, but more impressively you can make an interactive bar chart using scalable vector graphics from the same data.
That barely scratches the surface of what D3 can do, however. There are dozens of visualization methods -- like chord diagrams, bubble charts, node-link trees and dendrograms -- and thanks to D3's open source nature, new ones are being contributed all the time.
D3 has been designed to be extremely fast, it supports large datasets, and it works across hardware platforms. That has made it an increasingly popular tool for graphical visualizations of the results of Big Data analysis. Expect to see more of it in the coming months.
Download D3.js

HCatalog

HCatalog is an open source metadata and table management framework that works with Hadoop HDFS data, and which is distributed under the Apache license. It's being developed by engineers at Hortonworks, one of the commercial companies sponsoring Hadoop development.
The idea of HCatalog is to liberate Big Data by allowing different tools to share Hive. That means that Hadoop users making use of a tool like Pig or MapReduce or Hive have immediate access to data created with another tool, without any loading or transfer steps. Essentially it makes the Hive metastore available to users of other tools on Hadoop by providing connectors for MapReduce and Pig. Users of those tools can read data from and write data to Hive's warehouse.
It also has a command line tool, so that users who do not use Hive can operate on the metastore with Hive Data Definition Language statements.
Download HCatalog

And More

Other open source big data projects to watch:
Storm. Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing.
Kafka. Kafka is a messaging system that was originally developed at LinkedIn to serve as the foundation for LinkedIn's activity stream and operational data processing pipeline. It is now used at a variety of different companies for various data pipeline and messaging uses.
Julia. Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments.
Impala. Cloudera Impala is a distributed query execution engine that runs against data stored natively in Apache HDFS and Apache HBase.

Unix: Timing your cron jobs

http://www.itworld.com/operating-systems/354358/unix-timing-your-cron-jobs


Most of my Unix admin cronies cut their teeth on tools like cron. There is almost nothing as fundamentally essential to administering Unix systems as writing scripts and then setting them up to run without intervention. Even so, cron has become more versatile over the decades, and there are a lot of nice "tricks" that you can use to tailor your cron tasks to the work you need to do.
The traditional fields are fairly easy to remember -- as long as you can keep in mind that the smallest time unit that you can address is minutes and that the fields go from small units (minute) to large (month) until you hit the day of week field. I still often tick off the time fields on the fingers of my left hand. Let's see ...
  • minute after the hour (0-59)
  • hour of the day (0-23)
  • day of the month (1-X, where X depends on the month)
  • month of the year (1-12)
  • day of the week (0-6, where 0 is Sunday)
And, of course, an * in any of these time fields means "any" of the legitimate values.
As an example, this cron task would run at 15 minutes past every hour:
15 * * * * /usr/local/runtask
Easy enough! But there are many options that make scheduling tasks to run very frequently or very infrequently even easier.
For one thing, you can tell cron to run a process multiple times an hour, multiple times within a day, or even multiple times in a week by selecting two values and separating them with a comma. The value 6,18 in the hour field, for example, tells cron to run a task at 6 AM and 6 PM:
0 6,18 * * * /usr/local/runtask
while this one would tell cron to run the process at 6:15 AM, 6:45 AM, 6:15 PM and 6:45 PM:
15,45 6,18 * * * /usr/local/runtask
You can also tell cron to run your tasks over a range of values -- such as Monday through Friday (and not on the weekends) or from 8 AM to 6 PM as shown here:
0 23 * * 1-5 /usr/local/weekdays
0 8-18 * * 1-5 /usr/local/workhours
And, while I've seen numerous instances of "0,15,30,45" used to run a job every 15 minutes, the */15 specification does the same thing and is neater and easier.
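The step syntax also combines with ranges. A few equivalents, written as they would appear in a crontab (the task paths are from the examples above):

```
*/15 * * * *       /usr/local/runtask   # every 15 minutes
0 */2 * * *        /usr/local/runtask   # every two hours, on the hour
0 8-18/2 * * 1-5   /usr/local/runtask   # 8, 10, 12, 14, 16 and 18 on weekdays
```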
I was also quite surprised when I first noticed that the strings sun, mon, tue, wed, thu, fri and sat also work for the day of the week field. I've become so used to 0-6 that I sometimes refer to Saturdays as "day 6" and confuse my friends.
Linux systems also provide some very useful shorthands for running tasks at infrequent intervals. Instead of the five time fields shown above, you can use any of these:
@yearly runs on January 1st at 00:00 (equivalent to 0 0 1 1 *)
@monthly runs on the first of every month at 00:00 (equivalent to 0 0 1 * *)
@daily runs at the beginning of every day (equivalent to 0 0 * * *)
@hourly runs every hour (equivalent to 0 * * * *)
@reboot runs when the system boots
These top four options are very handy as long as you're happy to run your tasks at 00:00. Otherwise, you need to set your time values using the typical five fields.
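For instance, a crontab mixing the shortcuts with an ordinary five-field entry (the paths are hypothetical):

```
@daily     /usr/local/cleanup    # same as: 0 0 * * * /usr/local/cleanup
@reboot    /usr/local/startup    # once, as the system comes up
30 2 * * * /usr/local/backup     # @daily can't express 02:30, so spell it out
```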
Cron allows you to run a task as frequently as once a minute or, with restrictive date fields, only very rarely:
* * * * * /usr/local/everyminute
10 10 13 5 5 /usr/local/almostnever
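A subtlety worth knowing: in standard cron, when both the day-of-month and day-of-week fields are restricted, the entry fires when either one matches, so the last line above actually runs on every Friday in May as well as on May 13. To get a strict "Friday the 13th of May" schedule, set the day-of-week field to * in the crontab entry and test the weekday inside the script. A minimal sketch (the script path comes from the example above):

```shell
#!/bin/sh
# Top of /usr/local/almostnever: bail out unless today is a Friday.
# date +%u prints the weekday as 1-7 (Monday-Sunday), so Friday is 5.
[ "$(date +%u)" = 5 ] || exit 0
echo "doing the real work"
```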
Another useful thing to know, in case you're new to cron, is that by setting your EDITOR environment variable you can tell crontab -- the command you use to edit or display your cron tasks -- which editor you want to use. This will normally default to vi.
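For example, to use nano (just one hypothetical choice) instead:

```shell
EDITOR=nano crontab -e                    # use nano for this one editing session
echo 'export EDITOR=nano' >> ~/.bashrc    # make it the default for future sessions
crontab -l                                # list your cron tasks without editing them
```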