The (hard) job of IT engineers for dummies
I've been working in IT since... forever. I started very young, and I started because I like learning new things. That is what drives my life. Over all these years I have spent time understanding how things work; learning, because things change every day; and teaching newcomers what I learned... and trust me, it is not as easy as you might think. An IT guy is not (only) the person able to fix the problem with your printer; there are many different jobs and each one requires a specific skill. And personally, I'm not sure I would be able to fix your printer problem 😜
I decided to write this article because throughout my career I have always met people who thought that what we do is too easy (or at least that was my feeling):
What? I can't believe that adding this function will take 2 weeks
The system is failing again, why didn't you fix the problem?
Oh yes, I forgot to tell you. Tomorrow we plan to do XXX, which will bring twice the usual traffic to our systems.
I could spend the whole article writing examples like these, and I'm pretty sure that if you are reading this and you work in IT, you are already thinking about something that happened to you. But the goal is to show you, with everyday examples, all the things IT engineers take care of in their job and why we should trust what they are doing.
You have a PC, a Mac, or at least a mobile phone. I'm sure you do. And why, after a few years, do you spend money to replace it? "Oh damn, it is very slow today, what is happening?".
This morning, as usual, you wanted to move your latest photos into your catalog and adjust the light/brightness on some of them. You arrive at your desk with your coffee, open LightRoom (NDA Sorry Adobe :)) and wait... wait... wait... What do you do now?
- Close all the other open programs on your PC/Mac
- Reboot it... usually it helps 😂
- Clean up files?
- Call a friend?
Then you realize that you updated your Mac just yesterday. Maybe it is related but... what can you do?
So, all in all, you could spend the whole day trying to fix your computer without doing the activity you planned. The reasons why a computer is slow are sometimes quite tricky to discover.
Not enough disk space
How many times in your life have you had a problem like this one?
The PowerPoint presentation for your boss is almost finished. Just save it, and you will do a complete review in the afternoon... "Error saving the document. Not enough disk space". 😱 Oh damn, once again. Since you already know this problem, you immediately think of a solution:
- Empty your trash.
- Check and clean your download folder.
- If that is not enough, clean everything you can because you really need to save it.
- Oh wait, you have a USB stick. That's great. Save a copy there and you will deal with the disk space later.
In the afternoon you reopen your presentation and... WHAT??? Why can't you open your file anymore? Yes, it sometimes happens. Disks, no matter what kind, fail, and you lose your data. If it was the only copy of your presentation, I think you will spend the whole night rewriting it... I'm so sorry for you.
Internet: the thing that is saving our workdays in this lockdown period. We are in the connection era and we spend most of our days connected. Social networking, email, news, ... But do we know everything about it?
Monday morning, 10 o'clock. The whole family is at home because schools are closed, and you are about to start your video meeting. One minute in, you are completely unable to understand what the person in front of you is saying: it is laggy, the audio cuts out, the video is frozen most of the time. What do you do?
- Stop the video and keep only the audio. It helped yesterday... but unfortunately not today.
- "Wait, I'll hang up and call you back in a second". The "magic call-back fix". It is the new-era reboot 😎
- Reboot your internet provider's box?
- Reboot your wifi router?
- Oh yes. Share the 4G/5G connection from your phone.
At lunch time, you discover that your 3 children and your wife were all watching Netflix series at the same time. That was the problem!! Your internet connection is not fast enough to handle all these things at once. As you have another call in the afternoon, you ask all of them to please stop using Netflix between 2 and 3 o'clock. Then you start your call but... once again nothing is working. "HEY GUYS, STOP USING THE INTERNET!! 😠". They listened to you and they were not watching Netflix, but it is not the only thing that makes heavy use of your internet connection: online gaming, YouTube videos, social networks, IP TV, ...
You had an idea of how to fix the problem you found in the morning, but the afternoon one was a little bit different, and it led you to the same failure.
And what about when the problem occurs on the internet service provider's side? You lose the connection for an unknown reason, and it is not something you can control yourself. The only thing you can do is call the ISP call center and hear: "Can you check your cable, please?" "Can we try to reboot the box? Is the left light still red?" "We are sorry, everything seems to be working on our side and we are unable to identify the problem. A technician will come to you in 2 weeks!".
2 weeks when you are working from home 🤬... that's real life.
What works without electricity (battery or mains) today? I think this part will be easy to explain. What do you do when you have an electrical problem?
- Check whether your electrical circuit was overloaded and switch some things off.
- Check where it is down. Is it only at my place? Is it everywhere in my street?
And then? What about when you discover that the problem is coming from outside? You can call your electricity provider and then... wait :( Usually it is not 2 weeks here, because we are unable to live without electricity today.
IT Engineer (hard) Job
IT engineers take care of all these things at once. At least the ones working on internet/network-connected systems.
Imagine an e-commerce website. Which one do you know? Amazon. Yeah, for sure 🤑. The IT guys at Amazon have to check all these things regularly. But Amazon does not run on a system like the PC or tablet you are using to read this article. Imagine thousands of PCs like the one you are using (and you are still far from reality), each of which can fail for one of the reasons we talked about. Your bank's website? The same.
IT engineers spend time creating and configuring monitoring tools: they want to know about a problem before it happens. They need to prevent problems to keep systems working.
Let's get back to the disk space example. You discovered you were out of space when you tried to save your document. How long did it take to find a solution? 2 minutes? 5 minutes? You had time for that, and you were the only user of the PC. When you have thousands of computers that could run out of disk space, how do you know which computer is the failing one? If you are not "monitoring" your computers, you may know that "something is not working well" but you don't know exactly why. You need to check each of them to find out that the problem is the missing disk space. How long will it take to check every one of the thousands of computers? Hours? Days? But remember, we are working at Amazon. How many customers are trying to buy during these hours/days? How much money are you losing because customers are not able to buy?
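To make the idea of "monitoring" a bit more concrete, here is a minimal Python sketch of the disk space check. It is only an illustration, not how Amazon or any real monitoring tool does it: the names `DiskReport`, `hosts_running_out_of_space` and `local_report`, the host names, and the 90% threshold are all assumptions made up for this example. The only real library call is `shutil.disk_usage` from the Python standard library.

```python
import shutil
from typing import Dict, List, NamedTuple


class DiskReport(NamedTuple):
    """Disk figures reported by one computer (hypothetical structure)."""
    total_bytes: int
    used_bytes: int


def used_fraction(report: DiskReport) -> float:
    """Return the share of the disk that is in use, between 0.0 and 1.0."""
    if report.total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return report.used_bytes / report.total_bytes


def hosts_running_out_of_space(
    reports: Dict[str, DiskReport], threshold: float = 0.9
) -> List[str]:
    """List the computers whose disk is at least `threshold` full.

    The 0.9 default is an arbitrary value chosen for this example.
    """
    return sorted(
        host for host, report in reports.items()
        if used_fraction(report) >= threshold
    )


def local_report(path: str = ".") -> DiskReport:
    """Build a report for the disk that holds `path` on this computer."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return DiskReport(total_bytes=usage.total, used_bytes=usage.used)
```

If every computer sends its own report to one place, a single call to `hosts_running_out_of_space` answers "which computer is the failing one?" in a moment, instead of someone checking thousands of machines one by one.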
Clearly, you cannot wait for a problem to happen before fixing it; an IT guy imagines all the possible problems and finds a solution even before the problem occurs.
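One simple way to picture "before the problem occurs" is to estimate when a disk will be full instead of waiting for the error. The Python sketch below is a deliberately naive illustration: the function name `hours_until_full` is hypothetical, and drawing a straight line between the first and the last measurement is an assumption made for this example, not a description of how real monitoring tools forecast.

```python
from typing import Optional, Sequence, Tuple


def hours_until_full(
    samples: Sequence[Tuple[float, float]], capacity_gb: float
) -> Optional[float]:
    """Estimate how many hours are left before the disk is full.

    `samples` holds (hour, used_gb) measurements in chronological order.
    Returns None when usage is not growing, so no "full" date can be estimated.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (first_hour, first_used), (last_hour, last_used) = samples[0], samples[-1]
    if last_hour <= first_hour:
        raise ValueError("samples must be in chronological order")

    # Average growth between the first and the last measurement.
    growth_per_hour = (last_used - first_used) / (last_hour - first_hour)
    if growth_per_hour <= 0:
        return None

    remaining_gb = capacity_gb - last_used
    return max(0.0, remaining_gb / growth_per_hour)
```

For example, a 100 GB disk that went from 50 GB to 60 GB used in 10 hours is growing by 1 GB per hour, so the sketch gives about 40 hours before it is full: enough time to clean up or add space before anybody sees "Not enough disk space".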
Sometimes, as we have seen in the internet connection example, the problem does not show up in the way you planned for or have already seen. You thought about the possible problem, and you know how to fix it when it occurs (do not use Netflix during a video call, for example), but the system is failing anyway (and nobody is using Netflix). So what? Now you have to find a solution... and before you can find the solution you need to understand what the problem is. And time is running out...
As we have seen, problems can come from things you are not able to control. In the examples, we talked about the electrical system in your street or a failure at the internet provider. It is the same for the systems IT engineers work on... we can have electrical and internet problems. But not only those. Systems are very complex. Do you remember the thousands of computers? Each one with a cable connected to the internet box (it is pretty much like that 😂), each one with a power cable, a disk, ...
Do you know where all these computers are today? Usually, not where the IT engineers are working. They cannot check the cables themselves. They cannot reboot the internet box with the power button. Have you ever heard of Cloud Computing? Yeah, most of the computers IT engineers work on are moving to the Cloud today. Somewhere, but we don't know exactly where (it is not always very important), there is a place where these thousands of computers are located, connected to the internet and to the power. And right there you will find other IT engineers doing exactly the same job to ensure that the thousands of computers keep working.
Then you have IT engineers who install software on these computers and ensure that it always works as expected: that it does not slow down, like the photo catalog, and that it is always secure, because hackers are out there spending their time looking for holes in computers. You will find IT engineers at different levels of a system, each one with a specific role and knowledge. And at some point, you will even meet the one who is able to repair your printer.
I don't know if I reached my goal and if it is now much simpler to understand what we do to provide an IT service (even to allow you to read this article there is a computer someone is working on!). Yes, because I'm one of those engineers working on an e-commerce system. I know it is sometimes frustrating to see that we can't do what we planned because "a system is failing", but now you know. When a system is failing you surely have, somewhere in the world, a group of IT engineers working hard to find a solution, because customer satisfaction is what matters most. The system is failing because something they didn't plan for happened; because something they didn't monitor failed; ... All these engineers are learning new things at this moment; they are learning how to prevent the same problem in the future. So when the system fails again, it is not because they didn't do their job (yes, sometimes this happens too, unfortunately). Something new happened. A new challenge, and more things to learn in order to get better.