Digital Forensics Guide


A guide covering Digital Forensics, including the applications, libraries, and tools that will make you better and more efficient at Digital Forensics development.


Note: You can easily convert this markdown file to a PDF in VSCode using the handy Markdown PDF extension.

The Digital Forensics Guide is a GitHub repository by Michael Royal.

Digital Forensics Learning Resources

Digital Forensics is the process of recovering and preserving material found on digital devices during the course of criminal investigations. Digital forensics tools include hardware and software tools used by law enforcement to collect and preserve digital evidence and support or refute hypotheses before courts.

Computer Forensics is the process of examining digital media in a forensically sound manner with the goal of identifying, preserving, recovering, analyzing, and presenting facts and opinions about the digital information.

Mobile device forensics is the science of recovering digital evidence from a mobile device under forensically sound conditions using accepted methods. Mobile device forensics is an evolving specialty in the field of digital forensics.

Network forensics is a science that centers on the discovery and retrieval of information surrounding a cybercrime within a networked environment. Common forensic activities include the capture, recording and analysis of events that occurred on a network in order to establish the source of cyberattacks.

Database forensics is the process of interrogating a failed database and trying to reconstruct the metadata and page information from within a data set. Database recovery, by contrast, implies some kind of restorative process that will enable the database to become viable enough to be put back into a production environment, or healthy enough to provide a backup that can be used in a database restore.

Computer Forensics Training Courses | Udemy

Computer Forensics Courses | Coursera

Learn Computer Forensics with Online Courses and Lessons | edX

Computer Forensics Course Learning Path – Infosec Institute

National Computer Forensics Institute (NCFI) Training Courses

Computer Forensics Training and Courses | X-Ways

Mile2’s Certified Digital Forensics Examiner training course

Cyber Security Training, Certifications, Degrees and Resources | SANS Institute

IACIS – BCFE: Basic Computer Forensic Examiner course

Digital Forensics Tools, Libraries, and Frameworks

Autopsy® is a digital forensics platform and graphical interface to The Sleuth Kit® and other digital forensics tools. It is used by law enforcement, military, and corporate examiners to investigate what happened on a computer. You can even use it to recover photos from your camera’s memory card.

The Sleuth Kit® (TSK) is a library and collection of command line tools that allow you to investigate disk images. The core functionality of TSK allows you to analyze volume and file system data. The library can be incorporated into larger digital forensics tools and the command line tools can be directly used to find evidence.
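
As a sketch of how TSK's library side can be embedded, the snippet below uses pytsk3 (the Python bindings for The Sleuth Kit) to list the root directory of a raw disk image. The image path evidence.dd is a placeholder.

```python
import pytsk3

img = pytsk3.Img_Info("evidence.dd")   # open the raw (dd) image; path is hypothetical
fs = pytsk3.FS_Info(img)               # auto-detect and open the file system
root = fs.open_dir(path="/")           # open the root directory

for entry in root:
    name = entry.info.name.name.decode("utf-8", errors="replace")
    size = entry.info.meta.size if entry.info.meta else 0  # meta may be missing
    print(f"{name}\t{size} bytes")
```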

PTK Forensics is a computer forensic framework for the command-line tools in The Sleuth Kit, plus additional software modules.

DFF (Digital Forensics Framework) is a free and open source computer forensics software built on top of a dedicated Application Programming Interface (API). It can be used by both professionals and non-experts to quickly and easily collect, preserve, and reveal digital evidence without compromising systems and data.

Mobile Device Investigator® is a security tool that powers rapid investigations of iOS and Android devices by connecting a suspect device via USB port to perform logical acquisitions.

Digital Evidence Investigator® is a digital forensic tool for Windows, Linux, and macOS (including T2 and M1 chips). DEI collects digital evidence and presents it in a timeline view to tie the user to files and artifacts.

Digital Evidence Investigator® PRO is a tool that includes Windows, Linux and macOS (including T2 and M1 chips) computer forensic capabilities of Digital Evidence Investigator® and Mobile Device Investigator® iOS/Android capabilities in a single license.

Guymager is a free forensic imager for media acquisition. Its main features are:

  • Easy user interface in different languages.
  • Really fast, due to a multi-threaded, pipelined design and multi-threaded data compression.
  • Generates flat (dd), EWF (E01), and AFF images; supports disk cloning.
  • Free of charge, completely open source.

X-Ways Forensics is a commercial digital forensics platform for Windows.

X-Ways Investigator is a reduced, simplified version of X-Ways Forensics for police investigators, lawyers, and auditors.

WinHex is a hex editor, disk editor, and RAM editor used for computer forensics, data recovery, and IT security.

F-Response provides remote network drive analysis, remote RAM access, and cloud storage access.

AccessData Forensics Toolkit (FTK®) is built for speed, stability and ease of use. It provides comprehensive processing and indexing up front, so filtering and searching is faster than with any other product. This means you can zero in on the relevant evidence quickly, dramatically increasing your analysis speed.

OpenText™ EnCase™ is a commercial forensics platform. It offers support for evidence collection from over twenty-five different types of devices, including desktops, mobile devices and GPS. Within the tool, a forensic investigator can inspect the collected data and generate a wide range of reports based upon predefined templates.

Redline®, FireEye’s premier free endpoint security tool, provides host investigative capabilities to find signs of malicious activity through memory and file analysis and the development of a threat assessment profile. It collects information about running processes and drivers from memory, and gathers other data such as metadata, registry data, tasks, services, network information, and internet history to build a proper report.

Paraben’s Electronic Evidence Examiner—E3 is a comprehensive digital forensic platform designed to handle more data more efficiently while adhering to Paraben’s paradigm of specialized focus across the entire forensic exam process. Paraben has capabilities in:

  • Desktop forensics
  • Email forensics
  • Smartphone analysis
  • Cloud analysis
  • IoT forensics
  • Triage and visualization

Bulk Extractor is a program that extracts features such as email addresses, credit card numbers, URLs, and other types of information from digital evidence files. It is a useful forensic investigation tool for many tasks such as malware and intrusion investigations, identity and cyber investigations, as well as analyzing imagery and password cracking.
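
bulk_extractor itself is a C++ tool; as a rough illustration of the feature-extraction idea, this hypothetical Python sketch scans a raw evidence file for email-address patterns and reports each hit with its byte offset, much like an entry in a feature file. The file name evidence.raw is a placeholder.

```python
import re

# crude email-address pattern, applied to raw bytes
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

with open("evidence.raw", "rb") as f:   # hypothetical evidence image
    data = f.read()

for match in EMAIL.finditer(data):
    # byte offset of the feature, then the feature itself
    print(f"{match.start()}\t{match.group().decode('ascii', 'replace')}")
```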

Registry Recon is a powerful computer forensics tool developed by Arsenal Recon. The tool is used to extract, recover, and parse registry data from Windows systems. The process of manually scouring Windows Registry files proves to be extremely time consuming and leaves gaping holes in the ability to recover critical information.

Volatility is a memory forensics framework used for incident response and malware analysis. With this tool, you can extract information from running processes, network sockets, network connections, DLLs, and registry hives. It also has support for extracting information from Windows crash dump files and hibernation files.

WindowsSCOPE is a commercial memory forensics and reverse engineering tool used for analyzing volatile memory. It is basically used for reverse engineering of malware. It provides the ability to analyze the Windows kernel, drivers, DLLs and virtual and physical memory.

Wireshark is the most widely used network traffic analysis tool in existence. It has the ability to capture live traffic or ingest a saved capture file.

NetworkMiner is an open source Network Forensic Analysis Tool (NFAT) for Windows (also Linux, macOS, and FreeBSD). NetworkMiner can be used as a passive network sniffer/packet capturing tool in order to detect operating systems, sessions, hostnames, and open ports without putting any traffic on the network.

Xplico is an open-source network forensic analysis tool. It is used to extract useful data from applications which use Internet and network protocols. It supports most of the popular protocols including HTTP, IMAP, POP, SMTP, SIP, TCP, UDP, and others. Output data of the tool is stored in an SQLite database or MySQL database. It also supports both IPv4 and IPv6.

Oxygen Forensic Detective is a forensics tool that focuses on mobile devices but is capable of extracting data from a number of different platforms, including mobile, IoT, cloud services, drones, media cards, backups and desktop platforms. It uses physical methods to bypass device security (such as screen lock) and collects authentication data for a number of different mobile applications.

XRY is a collection of different commercial tools for mobile device forensics. XRY Logical is a suite of tools designed to interface with the mobile device operating system and extract the desired data. XRY Physical, on the other hand, uses physical recovery techniques to bypass the operating system, enabling analysis of locked devices.

SIFT Workstation is an open-source Linux virtual machine that aggregates free digital forensics tools. This platform was developed by the SANS Institute and its use is taught in a number of their courses.

HashKeeper is a central database repository of Forensic Intelligence donated by various sources, usually obtained by law enforcement during the course of forensic investigations of suspect systems.

Forensic Explorer Command Line (FEX CLI) is a forensic data processing engine used for computer forensics and electronic discovery. The FEX CLI can run on anything from a single workstation to an enterprise-level virtual environment, spawning multiple simultaneous processing instances.

FEX Memory Imager (FEX Memory) is a free imaging tool designed to capture the physical Random Access Memory (RAM) of a suspect’s running computer. This allows investigators to recover and analyze valuable artifacts found only in memory.

FEX Imager™ is a free forensic imaging program that will acquire or hash a bit-level forensic image with full MD5, SHA1, SHA256 hash authentication. It can acquire a physical drive, logical drive, folders and files, remote devices (using servlet), or re-acquire a forensic image.
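
As a sketch of the hash-authentication idea (not FEX Imager's own code), the snippet below computes MD5, SHA1, and SHA256 over an acquired image in a single pass using Python's standard library; the file name is a placeholder.

```python
import hashlib

# one digest object per algorithm, all fed from the same read pass
digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}

with open("acquired_image.dd", "rb") as f:   # hypothetical acquired image
    for chunk in iter(lambda: f.read(1024 * 1024), b""):   # 1 MiB chunks
        for d in digests.values():
            d.update(chunk)

for name, d in digests.items():
    print(f"{name}: {d.hexdigest()}")
```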

Forensic Explorer™ is a flexible and easy to use GUI with advanced sort, filter, keyword search, data recovery and script technology. It can quickly process large volumes of data, automate complex investigation tasks, produce detailed reports and increase productivity.

Rehex is a cross-platform (Windows, Linux, Mac) hex editor for reverse engineering, and everything else.

DIRTY is a tool for augmenting decompiler output with learned variable names and types, developed by the Socio-Technical Research Using Data Excavation Lab at Carnegie Mellon University.

Virtualization

HVM (Hardware Virtual Machine) is a virtualization type that provides the ability to run an operating system directly on top of a virtual machine without any modification, as if it were run on the bare-metal hardware.

PV (Paravirtualization) is an efficient and lightweight virtualization technique introduced by the Xen Project team, later adopted by other virtualization solutions. PV does not require virtualization extensions from the host CPU and thus enables virtualization on hardware architectures that do not support hardware-assisted virtualization.

Virtualization-based Security (VBS) is a hardware virtualization feature to create and isolate a secure region of memory from the normal operating system.

Hypervisor-Enforced Code Integrity (HVCI) is a mechanism whereby a hypervisor, such as Hyper-V, uses hardware virtualization to protect kernel-mode processes against the injection and execution of malicious or unverified code. Code integrity validation is performed in a secure environment that is resistant to attack from malicious software, and page permissions for kernel mode are set and maintained by the hypervisor.

KVM (for Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko, that provides the core virtualization infrastructure and a processor specific module, kvm-intel.ko or kvm-amd.ko.

QEMU is a fast processor emulator using a portable dynamic translator. QEMU emulates a full system, including a processor and various peripherals. It can be used to launch a different Operating System without rebooting the PC or to debug system code.

Hyper-V enables running virtualized computer systems on top of a physical host. These virtualized systems can be used and managed just as if they were physical computer systems; however, they exist in a virtualized and isolated environment. Special software called a hypervisor manages access between the virtual systems and the physical hardware resources. Virtualization enables quick deployment of computer systems, a way to quickly restore systems to a previously known good state, and the ability to migrate systems between physical hosts.

VirtManager is a graphical tool for managing virtual machines via libvirt. Most usage is with QEMU/KVM virtual machines, but Xen and libvirt LXC containers are well supported. Common operations for any libvirt driver should work.

oVirt is an open-source distributed virtualization solution, designed to manage your entire enterprise infrastructure. oVirt uses the trusted KVM hypervisor and is built upon several other community projects, including libvirt, Gluster, PatternFly, and Ansible. It was founded by Red Hat as a community project on which Red Hat Enterprise Virtualization is based, and allows centralized management of virtual machines, compute, storage, and networking resources from an easy-to-use, platform-independent, web-based front-end.

HyperKit is a toolkit for embedding hypervisor capabilities in your application. It includes a complete hypervisor, based on xhyve/bhyve, which is optimized for lightweight virtual machines and container deployment. It is designed to be interfaced with higher-level components such as the VPNKit and DataKit. HyperKit currently only supports macOS using the Hypervisor.framework, making it a core component of Docker Desktop for Mac.

Intel® Graphics Virtualization Technology (Intel® GVT) is a full GPU virtualization solution with mediated pass-through, starting from 4th generation Intel Core™ processors with Intel processor graphics (Broadwell and newer). It can be used to virtualize the GPU for multiple guest virtual machines, effectively providing near-native graphics performance in the virtual machine and still letting your host use the virtualized GPU normally.

Apple Hypervisor is a framework that builds virtualization solutions on top of a lightweight hypervisor, without third-party kernel extensions. Hypervisor provides C APIs so you can interact with virtualization technologies in user space, without writing kernel extensions (KEXTs). As a result, the apps you create using this framework are suitable for distribution on the Mac App Store.

Apple Virtualization Framework is a framework that provides high-level APIs for creating and managing virtual machines on Apple silicon and Intel-based Mac computers. This framework is used to boot and run a Linux-based operating system in a custom environment that you define. It also supports the Virtio specification, which defines standard interfaces for many device types, including network, socket, serial port, storage, entropy, and memory-balloon devices.

Apple Paravirtualized Graphics Framework is a framework that implements hardware-accelerated graphics for macOS running in a virtual machine, hereafter known as the guest. The operating system provides a graphics driver that runs inside the guest, communicating with the framework in the host operating system to take advantage of Metal-accelerated graphics.

Cloud Hypervisor is an open source Virtual Machine Monitor (VMM) that runs on top of KVM. The project focuses exclusively on running modern cloud workloads on top of a limited set of hardware architectures and platforms. Cloud workloads are those usually run by customers inside a cloud provider. Cloud Hypervisor is implemented in Rust and is based on the rust-vmm crates.

VMware vSphere Hypervisor is a bare-metal hypervisor that virtualizes servers, allowing you to consolidate your applications while saving time and money managing your IT infrastructure.

Xen is focused on advancing virtualization in a number of different commercial and open source applications, including server virtualization, Infrastructure as a Service (IaaS), desktop virtualization, security applications, embedded and hardware appliances, and automotive/aviation.

Ganeti is a virtual machine cluster management tool built on top of existing virtualization technologies such as Xen or KVM and other open source software. Once installed, the tool assumes management of the virtual instances (Xen DomU).

Packer is an open source tool for creating identical machine images for multiple platforms from a single source configuration. Packer is lightweight, runs on every major operating system, and is highly performant, creating machine images for multiple platforms in parallel. Packer does not replace configuration management like Chef or Puppet. In fact, when building images, Packer is able to use tools like Chef or Puppet to install software onto the image.

Vagrant is a tool for building and managing virtual machine environments in a single workflow. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases production parity, and makes the “works on my machine” excuse a relic of the past. It provides easy to configure, reproducible, and portable work environments built on top of industry-standard technology and controlled by a single consistent workflow to help maximize the productivity and flexibility of you and your team.

Parallels Desktop is a Desktop Hypervisor that delivers the fastest, easiest and most powerful application for running Windows/Linux on Mac (including the new Apple M1 chip) and ChromeOS.

VMware Fusion is a Desktop Hypervisor that delivers desktop and ‘server’ virtual machines, containers, and Kubernetes clusters to developers and IT professionals on the Mac.

VMware Workstation is a hosted hypervisor that runs on x64 versions of Windows and Linux operating systems; it enables users to set up virtual machines on a single physical machine, and use them simultaneously along with the actual machine.

File systems

GlusterFS is a free and open source scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks.

Ceph is a software-defined storage solution designed to address the object, block, and file storage needs of data centers adopting open source as the new norm for high-growth block storage, object stores and data lakes. Ceph provides enterprise scalable storage while keeping CAPEX and OPEX costs in line with underlying bulk commodity disk prices.

Hadoop Distributed File System (HDFS) is a distributed file system that handles large data sets running on commodity hardware. It is used to scale a single Apache Hadoop cluster to hundreds (and even thousands) of nodes. HDFS is one of the major components of Apache Hadoop, the others being MapReduce and YARN.

ZFS is an enterprise-ready open source file system and volume manager with unprecedented flexibility and an uncompromising commitment to data integrity.

OpenZFS is an open-source storage platform. It includes the functionality of both traditional file systems and volume manager. It has many advanced features including:

  • Protection against data corruption.
  • Integrity checking for both data and metadata.
  • Continuous integrity verification and automatic “self-healing” repair.

Btrfs is a modern copy on write (CoW) filesystem for Linux aimed at implementing advanced features while also focusing on fault tolerance, repair and easy administration. Its main features and benefits are:

  • Snapshots which do not make the full copy of files
  • RAID – support for software-based RAID 0, RAID 1, RAID 10
  • Self-healing – checksums for data and metadata, automatic detection of silent data corruptions

Squashfs is a compressed read-only filesystem for Linux. It uses zlib, lz4, lzo, or xz compression to compress files, inodes and directories. Inodes in the system are very small and all blocks are packed to minimize data overhead.

Apple File System (APFS), the default file system for Mac computers using macOS 10.13 or later, features strong encryption, space sharing, snapshots, fast directory sizing, and improved file system fundamentals.

NTFS (New Technology File System) is the primary file system for recent versions of Windows and Windows Server. It provides a full set of features including security descriptors, encryption, disk quotas, and rich metadata, and can be used with Cluster Shared Volumes (CSV) to provide continuously available volumes that can be accessed simultaneously from multiple nodes of a failover cluster.

exFAT (Extended File Allocation Table) is the successor to FAT32 in the FAT family of file systems. It was optimized for flash memory such as USB flash drives and SD cards.

Security Tools and Frameworks

Security Standards, Frameworks and Benchmarks

STIGs Benchmarks – Security Technical Implementation Guides

CIS Benchmarks – CIS Center for Internet Security

NIST – Current FIPS

ISO Standards Catalogue

Common Criteria for Information Technology Security Evaluation (CC) is an international standard (ISO/IEC 15408) for computer security. It allows an objective evaluation to validate that a particular product satisfies a defined set of security requirements.

ISO 22301 is the international standard that provides a best-practice framework for implementing an optimised BCMS (business continuity management system).

ISO 27001 is the international standard that describes the requirements for an ISMS (information security management system). The framework is designed to help organizations manage their security practices in one place, consistently and cost-effectively.

ISO 27701 specifies the requirements for a PIMS (privacy information management system) based on the requirements of ISO 27001. It is extended by a set of privacy-specific requirements, control objectives and controls. Companies that have implemented ISO 27001 will be able to use ISO 27701 to extend their security efforts to cover privacy management.

EU GDPR (General Data Protection Regulation) is a privacy and data protection law that supersedes existing national data protection laws across the EU, bringing uniformity by introducing just one main data protection law for companies/organizations to comply with.

CCPA (California Consumer Privacy Act) is a data privacy law that took effect on January 1, 2020 in the State of California. It applies to businesses that collect California residents’ personal information, and its privacy requirements are similar to those of the EU’s GDPR (General Data Protection Regulation).

Payment Card Industry (PCI) Data Security Standards (DSS) is a global information security standard designed to prevent fraud through increased control of credit card data.

SOC 2 is an auditing procedure that ensures your service providers securely manage your data to protect the interests of your company/organization and the privacy of their clients.

NIST CSF is a voluntary framework primarily intended for critical infrastructure organizations to manage and mitigate cybersecurity risk based on existing best practice.

Security Tools

Netdata is a high-fidelity infrastructure monitoring and troubleshooting agent that collects thousands of metrics from systems, hardware, containers, and applications in real time with zero configuration. It runs permanently on all your physical/virtual servers, containers, cloud deployments, and edge/IoT devices, and is safe to install on your systems mid-incident without any preparation.

IDA Pro (Interactive DisAssembler Professional) is a programmable, multi-processor disassembler combined with a local/remote debugger and a complete plugin programming environment. It’s a great tool for testing and discovering security vulnerabilities.

Ghidra is a software reverse engineering (SRE) framework developed by NSA’s Research Directorate for NSA’s cybersecurity mission. It helps analyze any malicious code and malware like viruses, and can give cybersecurity professionals a better understanding of potential vulnerabilities in their networks and systems.

DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.

Emissary is a P2P based data-driven workflow engine that runs in a heterogeneous, possibly widely dispersed, multi-tiered P2P network of compute resources. Workflow itineraries are not pre-planned as in conventional workflow engines, but are discovered as more information is discovered about the data.

MADCert is a cross-platform tool that consists of a certificate generator, a file system certificate manager, and a command line interface for the purposes of testing.

BLESS (Bastion’s Lambda Ephemeral SSH Service) is an SSH Certificate Authority that runs as an AWS Lambda function and is used to sign SSH public keys.

Zuul is an L7 application gateway that provides capabilities for dynamic routing, monitoring, resiliency, security, and more.

Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures. It is fully integrated with Spinnaker, the continuous delivery platform. Chaos Monkey will work with any backend that Spinnaker supports (AWS, Google Compute Engine, Azure, Kubernetes, Cloud Foundry).

Priam is a tool/process for backup/recovery, Token Management, and Centralized Configuration management for Cassandra.

Vector is an on-host performance monitoring framework which exposes hand-picked, high-resolution metrics to every engineer’s browser.

Control Groups (cgroups) is a Linux kernel feature that allows you to allocate resources such as CPU time, system memory, network bandwidth, or any combination of these resources for user-defined groups of tasks (processes) running on a system.

Libgcrypt is a general purpose cryptographic library originally based on code from GnuPG.

Aircrack-ng is a network software suite consisting of a detector, packet sniffer, WEP and WPA/WPA2-PSK cracker and analysis tool for 802.11 wireless LANs. It works with any wireless network interface controller whose driver supports raw monitoring mode and can sniff 802.11a, 802.11b and 802.11g traffic.

Burp Suite is a leading range of cybersecurity tools.

Cilium uses eBPF to accelerate getting data in and out of L7 proxies such as Envoy, enabling efficient visibility into API protocols like HTTP, gRPC, and Kafka.

Hubble is a network, service, and security observability platform for Kubernetes built on eBPF.

Istio is an open platform to connect, manage, and secure microservices. Istio’s control plane provides an abstraction layer over the underlying cluster management platform, such as Kubernetes and Mesos.

Certgen is a convenience tool to generate and store certificates for Hubble Relay mTLS.

Scapy is a python-based interactive packet manipulation program & library.
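
A minimal Scapy sketch: craft an ICMP echo request and wait for one reply. Sending raw packets generally requires root privileges, and the target address below is a documentation example.

```python
from scapy.all import IP, ICMP, sr1

# layer packets with the "/" operator: IP header + ICMP echo request
pkt = IP(dst="192.0.2.1") / ICMP()

# sr1() sends the packet and returns the first reply (or None on timeout)
reply = sr1(pkt, timeout=2, verbose=False)

if reply is not None:
    reply.show()        # dump the decoded layers of the reply
else:
    print("no reply")
```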

syzkaller is an unsupervised, coverage-guided kernel fuzzer.

SchedViz is a tool for gathering and visualizing kernel scheduling traces on Linux machines.

oss-fuzz aims to make common open source software more secure and stable by combining modern fuzzing techniques with scalable, distributed execution.

OSSEC is a free, open-source host-based intrusion detection system. It performs log analysis, integrity checking, Windows registry monitoring, rootkit detection, time-based alerting, and active response.

Metasploit Project is a computer security project that provides information about security vulnerabilities and aids in penetration testing and IDS signature development.

Wfuzz was created to facilitate the task of assessing web applications, and is based on a simple concept: it replaces any reference to the FUZZ keyword with the value of a given payload.

Nmap is a security scanner used to discover hosts and services on a computer network, thus building a “map” of the network.
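
This is not Nmap itself; as a toy illustration of what a port scanner does at its simplest, the sketch below attempts TCP connections to a few well-known ports on a hypothetical host and reports which ones accept.

```python
import socket

host = "192.0.2.10"                     # example target from documentation space
for port in (22, 80, 443, 3389):        # a few well-known ports
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(0.5)
    try:
        # connect_ex returns 0 when the TCP connect succeeds (port open)
        if s.connect_ex((host, port)) == 0:
            print(f"{host}:{port} open")
    finally:
        s.close()
```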

Patchwork is a web-based patch tracking system designed to facilitate the contribution and management of contributions to an open-source project.

pfSense is a free and open source firewall and router that also features unified threat management, load balancing, multi WAN, and more.

Snort is an open-source, free and lightweight network intrusion detection system (NIDS) software for Linux and Windows to detect emerging threats.

Wireshark is a free and open-source packet analyzer. It is used for network troubleshooting, analysis, software and communications protocol development, and education.

OpenSCAP implements SCAP, a U.S. standard maintained by the National Institute of Standards and Technology (NIST). It provides multiple tools to assist administrators and auditors with assessment, measurement, and enforcement of security baselines. OpenSCAP maintains great flexibility and interoperability while reducing the costs of performing security audits. Whether you want to evaluate DISA STIGs, NIST’s USGCB, or Red Hat’s Security Response Team’s content, all are supported by OpenSCAP.

Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and harder to misuse.

OWASP is an online community that produces freely available articles, methodologies, documentation, tools, and technologies in the field of web application security.

Open Vulnerability and Assessment Language (OVAL) is a community effort to standardize how to assess and report upon the machine state of computer systems. OVAL includes a language to encode system details, and community repositories of content. Tools and services that use OVAL provide enterprises with accurate, consistent, and actionable information to improve their security.

Networking


Network Learning Resources

AWS Certified Security – Specialty Certification

Microsoft Certified: Azure Security Engineer Associate

Google Cloud Certified Professional Cloud Security Engineer

Cisco Security Certifications

The Red Hat Certified Specialist in Security: Linux

Linux Professional Institute LPIC-3 Enterprise Security Certification

Cybersecurity Training and Courses from IBM Skills

Cybersecurity Courses and Certifications by Offensive Security

Citrix Certified Associate – Networking (CCA-N)

Citrix Certified Professional – Virtualization (CCP-V)

CCNP Routing and Switching

Certified Information Security Manager (CISM)

Wireshark Certified Network Analyst (WCNA)

Juniper Networks Certification Program Enterprise (JNCP)

Networking courses and specializations from Coursera

Network & Security Courses from Udemy

Network & Security Courses from edX

Networking Tools & Concepts

cURL is a computer software project providing a library and command-line tool for transferring data using various network protocols (HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP, DICT, TELNET, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SMB, SMBS, SMTP, or SMTPS). cURL is also used in cars, television sets, routers, printers, audio equipment, mobile phones, tablets, set-top boxes, and media players, and is the Internet transfer engine for thousands of software applications in over ten billion installations.
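
As a quick illustration of curl-as-a-library, this sketch fetches a URL with pycurl, the Python binding to libcurl; the URL is an example.

```python
from io import BytesIO
import pycurl

buf = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, "https://example.com/")  # example URL
c.setopt(c.WRITEDATA, buf)               # write the response body into the buffer
c.perform()                              # run the transfer
print(c.getinfo(c.RESPONSE_CODE), len(buf.getvalue()), "bytes")
c.close()
```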

cURL Fuzzer provides quality assurance testing for the curl project.

DoH is a stand-alone application for DoH (DNS-over-HTTPS) name resolution and lookups.

HTTPie is a command-line HTTP client. Its goal is to make CLI interaction with web services as human-friendly as possible. HTTPie is designed for testing, debugging, and generally interacting with APIs & HTTP servers.

HTTPStat is a tool that visualizes curl statistics in a simple layout.

Wuzz is an interactive CLI tool for HTTP inspection. It can be used to inspect/modify requests copied from the browser’s network inspector with the “copy as cURL” feature.

Websocat is a command-line client for WebSockets, like netcat (or curl) for ws://, with advanced socat-like functions.

• Connection: In networking, a connection refers to pieces of related information that are transferred through a network. This generally implies that a connection is built before the data transfer (by following the procedures laid out in a protocol) and then is deconstructed at the end of the data transfer.

• Packet: A packet is, generally speaking, the most basic unit that is transferred over a network. When communicating over a network, packets are the envelopes that carry your data (in pieces) from one end point to the other.

Packets have a header portion that contains information about the packet, including the source and destination, timestamps, and network hops. The main portion of a packet contains the actual data being transferred. It is sometimes called the body or the payload.

• Network Interface: A network interface can refer to any kind of software interface to networking hardware. For instance, if you have two network cards in your computer, you can control and configure each network interface associated with them individually.

A network interface may be associated with a physical device, or it may be a representation of a virtual interface. The “loop-back” device, which is a virtual interface to the local machine, is an example of this.

• LAN: LAN stands for "local area network". It refers to a network or a portion of a network that is not publicly accessible to the greater internet. A home or office network is an example of a LAN.

• WAN: WAN stands for "wide area network". It means a network that is much more extensive than a LAN. While WAN is the relevant term to use to describe large, dispersed networks in general, it usually refers to the internet as a whole.

If an interface is connected to the WAN, it is generally assumed that it is reachable through the internet.

• Protocol: A protocol is a set of rules and standards that basically define a language that devices can use to communicate. There are a great number of protocols in use extensively in networking, and they are often implemented in different layers.

Some low level protocols are TCP, UDP, IP, and ICMP. Some familiar examples of application layer protocols, built on these lower protocols, are HTTP (for accessing web content), SSH, TLS/SSL, and FTP.

• Port: A port is an address on a single machine that can be tied to a specific piece of software. It is not a physical interface or location, but it allows your server to be able to communicate using more than one application.

• Firewall: A firewall is a program that decides whether traffic coming into a server or going out should be allowed. A firewall usually works by creating rules for which type of traffic is acceptable on which ports. Generally, firewalls block ports that are not used by a specific application on a server.

• NAT: Network address translation is a way to translate requests that are incoming into a routing server to the relevant devices or servers that it knows about in the LAN. This is usually implemented in physical LANs as a way to route requests through one IP address to the necessary backend servers.

• VPN: Virtual private network is a means of connecting separate LANs through the internet, while maintaining privacy. This is used as a means of connecting remote systems as if they were on a local network, often for security reasons.

Network Layers

While networking is often discussed in terms of topology in a horizontal way, between hosts, its implementation is layered in a vertical fashion throughout a computer or network. This means that there are multiple technologies and protocols that are built on top of each other in order for communication to function more easily. Each successive, higher layer abstracts the raw data a little bit more, and makes it simpler to use for applications and users. It also allows you to leverage lower layers in new ways without having to invest the time and energy to develop the protocols and applications that handle those types of traffic.

As data is sent out of one machine, it begins at the top of the stack and filters downwards. At the lowest level, actual transmission to another machine takes place. At this point, the data travels back up through the layers of the other computer. Each layer has the ability to add its own "wrapper" around the data that it receives from the adjacent layer, which will help the layers that come after decide what to do with the data when it is passed off.

One method of talking about the different layers of network communication is the OSI model. OSI stands for Open Systems Interconnection. This model defines seven separate layers. The layers in this model are:

• Application: The application layer is the layer that the users and user-applications most often interact with. Network communication is discussed in terms of availability of resources, partners to communicate with, and data synchronization.

• Presentation: The presentation layer is responsible for mapping resources and creating context. It is used to translate lower level networking data into data that applications expect to see.

• Session: The session layer is a connection handler. It creates, maintains, and destroys connections between nodes in a persistent way.

• Transport: The transport layer is responsible for handing the layers above it a reliable connection. In this context, reliable refers to the ability to verify that a piece of data was received intact at the other end of the connection. This layer can resend information that has been dropped or corrupted and can acknowledge the receipt of data to remote computers.

• Network: The network layer is used to route data between different nodes on the network. It uses addresses to be able to tell which computer to send information to. This layer can also break apart larger messages into smaller chunks to be reassembled on the opposite end.

• Data Link: This layer is implemented as a method of establishing and maintaining reliable links between different nodes or devices on a network using existing physical connections.

• Physical: The physical layer is responsible for handling the actual physical devices that are used to make a connection. This layer involves the bare software that manages physical connections as well as the hardware itself (like Ethernet).

The TCP/IP model, more commonly known as the Internet protocol suite, is another layering model that is simpler and has been widely adopted. It defines four separate layers, some of which overlap with the OSI model:

• Application: In this model, the application layer is responsible for creating and transmitting user data between applications. The applications can be on remote systems, and should appear to operate as if local to the end user.

The communication takes place between network peers.

• Transport: The transport layer is responsible for communication between processes. This level of networking utilizes ports to address different services. It can build up unreliable or reliable connections depending on the type of protocol used.

• Internet: The internet layer is used to transport data from node to node in a network. This layer is aware of the endpoints of the connections, but does not worry about the actual connection needed to get from one place to another. IP addresses are defined in this layer as a way of reaching remote systems in an addressable manner.

• Link: The link layer implements the actual topology of the local network that allows the internet layer to present an addressable interface. It establishes connections between neighboring nodes to send data.

Interfaces

Interfaces are networking communication points for your computer. Each interface is associated with a physical or virtual networking device. Typically, your server will have one configurable network interface for each Ethernet or wireless internet card you have. In addition, it will define a virtual network interface called the “loopback” or localhost interface. This is used as an interface to connect applications and processes on a single computer to other applications and processes. You can see this referenced as the “lo” interface in many tools.
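
From Python, for example, you can ask the kernel for its interface table with the standard library (socket.if_nameindex() is available on Unix-like systems); the "lo" loopback interface shows up alongside any physical NICs.

```python
import socket

# each entry is (index, name), e.g. (1, "lo"), (2, "eth0")
for index, name in socket.if_nameindex():
    print(index, name)
```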

Network Protocols

Networking works by piggybacking a number of different protocols on top of one another. In this way, one piece of data can be transmitted using multiple protocols encapsulated within one another.

Media Access Control (MAC) is a communications protocol that is used to distinguish specific devices. Each device is supposed to get a unique MAC address during the manufacturing process that differentiates it from every other device on the internet. Addressing hardware by the MAC address allows you to reference a device by a unique value even when the software on top may change the name for that specific device during operation. Media access control is one of the only protocols from the link layer that you are likely to interact with on a regular basis.
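
As a small example, Python's standard library can report this machine's MAC address as a 48-bit integer, which the sketch below formats in the familiar colon-separated form (note that uuid.getnode() may fall back to a random value if no hardware address can be read).

```python
import uuid

mac = uuid.getnode()   # this machine's MAC as a 48-bit integer
# print the six bytes, most significant first, as aa:bb:cc:dd:ee:ff
print(":".join(f"{(mac >> shift) & 0xff:02x}" for shift in range(40, -8, -8)))
```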

The IP protocol is one of the fundamental protocols that allow the internet to work. IP addresses are unique on each network and they allow machines to address each other across a network. It is implemented in the internet layer of the TCP/IP model. Networks can be linked together, but traffic must be routed when crossing network boundaries. This protocol assumes an unreliable network and multiple paths to the same destination that it can dynamically change between. There are a number of different implementations of the protocol. The most common implementation today is IPv4, although IPv6 is growing in popularity as an alternative due to the scarcity of IPv4 addresses available and improvements in the protocol’s capabilities.
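
Python's standard ipaddress module makes these addressing concepts concrete; the addresses and network below are documentation examples.

```python
import ipaddress

addr = ipaddress.ip_address("203.0.113.7")      # parses IPv4 or IPv6 automatically
net = ipaddress.ip_network("203.0.113.0/24")    # a network with its prefix length

print(addr in net)                               # True: membership test
print(addr.version)                              # 4
print(ipaddress.ip_address("2001:db8::1").version)  # 6
```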

ICMP: Internet control message protocol is used to send messages between devices to indicate their availability or error conditions. These packets are used in a variety of network diagnostic tools, such as ping and traceroute. Usually ICMP packets are transmitted when a packet of a different kind meets some kind of a problem. Basically, they are used as a feedback mechanism for network communications.

TCP: Transmission control protocol is implemented in the transport layer of the TCP/IP model and is used to establish reliable connections. TCP is one of the protocols that encapsulates data into packets. It then transfers these to the remote end of the connection using the methods available on the lower layers. On the other end, it can check for errors, request certain pieces to be resent, and reassemble the information into one logical piece to send to the application layer. The protocol builds up a connection prior to data transfer using a system called a three-way handshake. This is a way for the two ends of the communication to acknowledge the request and agree upon a method of ensuring data reliability. After the data has been sent, the connection is torn down using a similar four-way handshake. TCP is the protocol of choice for many of the most popular uses for the internet, including WWW, FTP, SSH, and email. It is safe to say that the internet we know today would not be here without TCP.
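
From application code the handshake is invisible; a single connect() call triggers it in the kernel. A sketch, using example.com as a stand-in server:

```python
import socket

# create_connection resolves the name and performs the TCP connect;
# by the time it returns, SYN, SYN-ACK, and ACK have been exchanged
with socket.create_connection(("example.com", 80), timeout=5) as s:
    s.sendall(b"HEAD / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    print(s.recv(120).decode("ascii", "replace"))   # first bytes of the reply
```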

UDP: User datagram protocol is a popular companion protocol to TCP and is also implemented in the transport layer. The fundamental difference between UDP and TCP is that UDP offers unreliable data transfer. It does not verify that data has been received on the other end of the connection. This might sound like a bad thing, and for many purposes, it is. However, it is also extremely important for some functions. Because it is not required to wait for confirmation that the data was received or to resend data, UDP is much faster than TCP. It does not establish a connection with the remote host; it simply fires off the data to that host and doesn’t care if it is accepted or not. Since UDP is a simple transaction, it is useful for simple communications like querying for network resources. It also doesn’t maintain a state, which makes it great for transmitting data from one machine to many real-time clients. This makes it ideal for VOIP, games, and other applications that cannot afford delays.
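
The contrast is visible in code: a UDP sender needs no connection setup at all and returns immediately, as in this sketch (the destination address is an example).

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # a datagram socket
s.sendto(b"hello", ("192.0.2.20", 9999))  # fire and forget: no handshake, no ACK
s.close()
```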

HTTP: Hypertext transfer protocol is a protocol defined in the application layer that forms the basis for communication on the web. HTTP defines a number of functions that tell the remote system what you are requesting. For instance, GET, POST, and DELETE all interact with the requested data in a different way.
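
A sketch with Python's standard library, issuing a GET and reading back the status that HTTP defines (the host is an example).

```python
import http.client

conn = http.client.HTTPSConnection("example.com")
conn.request("GET", "/")          # method and path, as HTTP defines them
resp = conn.getresponse()
print(resp.status, resp.reason)   # e.g. 200 OK
conn.close()
```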

FTP: File transfer protocol is in the application layer and provides a way of transferring complete files from one host to another. It is inherently insecure, so it is not recommended for any externally facing network unless it is implemented as a public, download-only resource.

DNS: Domain name system is an application layer protocol used to provide a human-friendly naming mechanism for internet resources. It is what ties a domain name to an IP address and allows you to access sites by name in your browser.
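
A sketch of name resolution via the system resolver from Python (the domain is an example).

```python
import socket

# getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
# the first element of sockaddr is the resolved address
for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo("example.com", None):
    print(family.name, sockaddr[0])
```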

SSH: Secure shell is an encrypted protocol implemented in the application layer that can be used to communicate with a remote server in a secure way. Many additional technologies are built around this protocol because of its end-to-end encryption and ubiquity.

There are many other protocols that we haven’t covered that are equally important. However, this should give you a good overview of some of the fundamental technologies that make the internet and networking possible.

JSON Web Token (JWT) is a compact URL-safe means of representing claims to be transferred between two parties. The claims in a JWT are encoded as a JSON object that is digitally signed using JSON Web Signature (JWS).
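
A sketch of the wire format: a JWT is three base64url segments, header.payload.signature. Decoding the first two reveals the signed JSON claims; the token below is a made-up example, and a real library must still verify the signature.

```python
import base64
import json

token = ("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
         "eyJzdWIiOiIxMjM0NTY3ODkwIn0."
         "signature-goes-here")   # fabricated example token

def b64url_decode(seg: str) -> bytes:
    # base64url data omits padding; restore it before decoding
    return base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4))

header, payload, _sig = token.split(".")
print(json.loads(b64url_decode(header)))    # {'alg': 'HS256', 'typ': 'JWT'}
print(json.loads(b64url_decode(payload)))   # {'sub': '1234567890'}
```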

OAuth 2.0 is an open standard authorization framework that enables applications to obtain limited access to user accounts on an HTTP service, such as Amazon, Google, Facebook, Microsoft, Twitter, GitHub, and DigitalOcean. It works by delegating user authentication to the service that hosts the user account, and authorizing third-party applications to access the user account.