first to solve it
System calls and process communicate as assembly
if process in user space we can manipulate it with
hard links are limited

Operating Systems (mostly based on xv6)
  - How does the OS engage with external (to the CPU) devices?
    - How does the OS use (persistent) storage?
      - file abstraction
        - file system implementation
          - Overall Organization (pg. 464)
          - Directory Organization
          - file organization in memory (on-disk organization of the data structures of the vsfs file system) (pg. 463)
            - Inodes
            - allocation structures
            - superblock
            - 4 KB blocks
          - how to assemble a full directory tree?
            - first making file systems
            - mounting it to make it visible
            - mount() (pg. 456)
            - mkfs (pg. 456)
          - Links
            - Hard links: entry in the file system tree, created through a system call known as link() (pg. 452)
            - soft links
          - Free Space Management
            - bitmaps (pg. 470)
          - how to assemble a full directory tree from many underlying file systems?
            - mkfs (pg. 456)
        - File System Interface (pg. 443)
          - files
            - How do we create files?
              - int fd = open("foo", O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR); (pg. 443)
                - returns: a file descriptor
                - once you have such an object, you can call other "methods" to access the file, like read() and write()
              - creat() (pg. 443)
            - How do we read and write files?
              - read()
              - write()
            - Reading And Writing, But Not Sequentially
              - lseek() system call (pg. 446)
            - Writing Immediately
              - fsync()
            - Renaming Files (pg. 448)
            - Removing Files
              - unlink() (pg. 450)
              - why unlink? (pg. 452), or see the Links node
          - directories
            - Making Directories
              - mkdir()
            - 39.11 Reading Directories
              - opendir() (pg. 451)
              - readdir()
              - closedir()
            - deleting directories
              - rmdir() (pg. 452)
        - The Crash Consistency Problem
          - How do we avoid ruining files during a crash?
            - Solution #1: The File System Checker
              - fsck (pg. 495)
              - problems
                - too slow
                - can't fix the case where the file system looks consistent but the inode points to garbage data
                - Security issues: a block could migrate from the password file to some other random file.
            - Solution #2: Journaling (pg. 491) (based on write-ahead logging)
              - Step 1: Journal write (pg. 501)
              - Step 2: Journal commit
              - Step 3: Checkpoint
              - Step 4: Free (pg. 503)
              - How does journaling recover?
                - if the crash happens before Step 2: easy, the pending update is simply skipped (pg. 501)
                - if the crash happens after Step 2 but before Step 3: redo logging (pg. 501)
                - at any point during checkpointing: no problem
              - problems
                - we are writing each data block to the disk twice
                - metadata journaling (pg. 504)
                  - add new step between 1 and 2 to write metadata
                  - this is the most popular approach
            - Other solutions?
              - copy-on-write
              - backpointer-based consistency
              - optimistic crash consistency
      - external storage
        - Hard Disk
        - magnetic tapes
        - flash storage
          - Flash-based SSDs
          - Flash drives
        - how do we evaluate external storage drives?
          - I/O Time (pg. 408)
            - transfer time
            - rotation time
            - seek time
              - "AVERAGE" SEEK time (pg. 411)
          - cost and other engineering factors
          - SSD vs hard drive in-depth comparison
        - HOW TO MAKE A LARGE, FAST, RELIABLE DISK?
          - Redundant Array of Inexpensive Disks (RAID) (pg. 421)
        - How is the address space of a modern disk organized?
          - the drive consists of a large number of sectors (512-byte blocks), each of which can be read or written.
      - how to communicate with software?
    - Persistent devices: I/O
      - HOW TO BUILD I/O DEVICE-NEUTRAL OS
        - hard disk drivers (pg. 403)
          - HOW TO STORE AND ACCESS DATA ON DISK?
            - A Simple Disk Drive (pg. 404)
            - Disk Scheduling (pg. 412)
            - understand disk performance (pg. 409)
            - Reading A File From Disk
            - Writing to Disk (pg. 472)
        - (software) device driver (pg. 396)
      - I/O bus
        - PCI
      - HOW TO COMMUNICATE WITH DEVICES
        - I/O instructions (pg. 395)
        - Memory-mapped I/O (pg. 395)
      - peripheral I/O
        - Keyboard
        - USB
        - mice
      - How do we know when asynchronous I/O completes?
        - Polling: ask the device every time
        - HOW TO AVOID THE COSTS OF POLLING?
          - Interrupts (pg. 392)
          - but interrupts are not always better
          - Maskable interrupts
          - Nonmaskable interrupts
      - HOW TO LOWER PIO OVERHEADS
        - transferring a large chunk of data to a device is wasted CPU time
        - Direct Memory Access (DMA)
          - A DMA engine is essentially a very specific device within a system that can orchestrate transfers between devices and main memory without much CPU intervention. (pg. 394)
  - How does the OS use the CPU in xv6?
    - The OS has the illusion of many, many CPUs, but in hardware we only have a few CPUs
    - how do we interact with the many, many virtual CPUs?
      - The (Linux) Kernel API
        - Accessing hardware resources
        - kernel internal API
          - Kernel I/O Subsystem
            - How does the processor give commands and data to a controller to accomplish an I/O transfer?
              - I/O instructions: port-mapped I/O
                - data-in register
                - data-out register
                - status register
                - control register
              - device-control registers are mapped into the address space of the processor: Memory-Mapped I/O
              - large data transfer?
                - Direct Memory Access (DMA)
                - direct virtual memory access (DVMA)
            - Nonblocking and Asynchronous I/O
        - every time we want to do something with our virtual memory and CPU, we use the kernel API
        - We create a process
          - How does the Linux kernel communicate with a process?
            - UNIX SIGNALS
          - how do we isolate different processes from one another?
            - Process Memory Layout
            - Namespaces
              - mnt (mount points, file systems) namespace
              - process ID (PID) namespace
              - net (network stack) namespace
              - interprocess communication (System V IPC) namespace
              - UNIX Time-Sharing (UTS) (hostname) namespace
              - user namespace
              - system calls
                - create
                  - clone(): new process and namespace
                  - unshare(): creates new namespace
                - termination
                  - exit()
                - fork()
                - join existing namespace
                  - setns()
        - kernel-userspace API
          - System Interface
            - POSIX API for POSIX-based systems (including virtually all versions of UNIX, Linux, and Mac OS X)
            - the C standard library: standard wrapper to access the system interface
              - The Process API wrapper
                - wait()
                - fork()
                - exec()
              - (UNIX) shell
                - Interacting with the computer by writing a C program (the shell is just a user program)
            - Other system calls not part of the standard library
              - Process control
              - File management
              - Device management
              - Information maintenance
              - Communication
              - Protection
      - threads
        - thread creation
          - pthread_create() (pg. 280)
          - int pthread_join
        - Thread Completion
          - pthread_join()
        - Thread locks (pg. 285)
          - int pthread_mutex_lock(pthread_mutex_t *mutex);
          - int pthread_mutex_unlock(pthread_mutex_t *mutex);
      - HOW TO PROVIDE SUPPORT FOR SYNCHRONIZATION with the virtual many, many CPUs?
        - Thread API
        - concepts
          - multithreading
          - concurrency
          - Asynchrony
        - Lock
          - a lock or mutex (from mutual exclusion) is a synchronization primitive: a mechanism that enforces limits on access to a resource when there are many threads of execution.
          - HOW TO BUILD A LOCK
            - criteria
            - Controlling Interrupts
            - Test And Set (Atomic Exchange)
  - How does the OS use memory in xv6?
    - Memory abstraction on the OS
      - how do we organize memory for each process?
        - Every process has a stack separated into 2 parts
          - The kernel space (TOP OF STACK), which is the location where the code of the kernel is stored and executes.
          - The user space: the set of locations where normal user processes run (i.e. everything other than the kernel). The role of the kernel is to keep applications running in this space from messing with each other and with the machine.
        - How do we implement multilevel page tables in xv6?
        - mmap()
        - malloc, free
      - How do we prevent corrupt memory during crashes?
        - shadow paging
          - similar to journaling
          - see journaling node
      - How do we write code for each process?
        - base and bounds
        - segmentation (generalized base and bounds)
        - paging
          - (TLB) Multilevel page table
            - TLB
              - a process will never accidentally encounter the wrong translations in the TLB
                - HOW TO MANAGE TLB CONTENTS ON A CONTEXT SWITCH
                  - flush the TLB on context switches (pg. 191)
              - when we are installing a new entry in the TLB, we have to replace an old one, and thus the question: which one to replace?
                - HOW TO DESIGN TLB REPLACEMENT POLICY
                  - least-recently-used (pg. 192)
          - the page data structure
          - problems
  - The Hardware
    - the CPU
      - Heterogeneous Processors
        - How do we build a heterogeneous computer?
          - Helios
            - A kinda Multikernel
            - Limitations
              - limited set of applications. Difficult to implement satellite kernels
              - need new compiler support for new platforms
            - what does it provide?
              - Simplify app development, deployment, and tuning
              - Provide single programming model for heterogeneous systems
            - How does it work?
              - Satellite kernels: Same OS abstraction everywhere
              - Remote message passing: Transparent IPC between kernels
              - Affinity Metrics: Easily express arbitrary placement policies to OS
                - positive affinity: processes should be colocated in that stack
                - negative affinity: processes should be on different kernels
                - self-reference affinity: represents copies of the process measure
              - 2-phase compilation: Run apps on arbitrary devices
              - priority algorithm makes decisions
            - built on Singularity OS
              - single address space!
        - Same ISA but different extensions or micro-architecture: ARM big.LITTLE, Xeon Phi, Intel Sunny Cove
        - Re-configurable FPGAs
        - Different ISA on same chip: AMD integrating x86 and ARM
        - Accelerators such as GPUs and TPUs
    - RAM
    - assembly code concepts
      - assembly code - compilers
        - you still need to convert assembly code to machine code
      - How do we actually use registers in modern computers?
        - Compiler takes care of registers for you
        - x86-64 registers
  - virtual machine
    - I want the benefits of a VM but it's too large and slow!
      - containers: not a full VM but kinda
        - implementation
          - Docker
            - How do we specify instructions for setup?
              - Dockerfile
            - where do I find images ready to go for specific applications?
              - Docker Hub of useful Docker images
            - How does it work?
              - kernel namespaces
                - A lightweight way to virtualize a process
                - see namespace node
              - cgroups (control groups)
                - what do cgroups provide?
                  - Resource limits
                  - Accounting
                  - Control
                  - Prioritization
                - How are cgroups implemented?
                  - few kernel additions
                  - none critically impact performance
                  - A new file system of type "cgroup" (VFS)
                  - Systemwide: /proc/cgroups
                  - For each process: /proc/pid/cgroup
              - UnionFS
                - what does UnionFS provide?
                  - several containers can share common data
                  - Writes to one container do not affect another
                  - On write, the overwritten data is saved to a new path specific to the container
        - Manage multiple containers
          - Deploy containers in a cluster
            - Kubernetes
              - what are the benefits of Kubernetes?
                - Service discovery and load balancing
                - Storage orchestration
                - Automated rollouts and rollbacks
                - Automatic bin packing
                - Self-healing
                - Secret and configuration management
            - Docker Swarm
        - VM vs Containers
    - why
      - MULTIPLEXING AND EMULATION
    - Popek and Goldberg formalized the relationship between a virtual machine and hypervisor (which they call VMM)
    - Virtualization in Computer Architecture (emphasis is on resource allocation)
      - Bare-metal Hypervisor (type-1)
        - How do we handle I/O?
          - Direct access
            - a virtual machine with a dedicated physical I/O device can access the entire physical memory using DMA operations. This vulnerability can be mitigated by an IOMMU
            - IOMMU
            - SR-IOV (Single Root Input Output Virtualization)
              - widely used today to achieve low-latency networking.
            - DPDK
        - Xen
          - trap and emulate
          - paravirtualization
        - Microsoft Hyper-V
        - x86 was not virtualizable
          - Intel® Virtualization Technology (VT-x)
            - what hardware modifications are required?
              - root mode
        - VMware ESX Server
        - how do we handle memory?
    - Virtualization within Operating Systems (emphasis is on resource allocation)
      - Hosted Hypervisor (type-2)
        - how do we handle memory?
        - VirtualBox
          - the host OS has no idea about the VMM; it's just another application
        - QEMU/Linux KVM (kernel virtualization module): full system simulator
          - Binary translation
        - VMM workstation (Year 2000)
          - binary translation
        - How do we handle I/O?
          - interposition
          - paravirtualization
        - x86 not virtualizable by Popek and Goldberg conditions
  - OS Architectures
    - Unikernel
      - implementation
      - problem
        - How should we structure an OS for future multicore systems?
      - Solution
        - structure the OS as a distributed system
    - Multikernel
      - implementation
        - Barrelfish project
          - supports x86-64 multiprocessors
          - will support ARM soon
          - open sourced
      - problem
        - How should we structure an OS for future multicore systems?
          - scalability to many cores
            - current-day core interconnectivity restricts communication to neighboring cores
          - heterogeneity and hardware diversity
            - we have specialized chips for specialized functions, but these don't communicate so well
      - Solution
        - structure the OS as a distributed system
          - explicit inter-core communication
            - decouple system structure from inter-core communication
            - naturally supports heterogeneous cores, non-coherent interconnects (PCIe)
              - Intel 80-core
              - see Hardware: heterogeneous node
            - all communication with messages (no shared state)
              - Tile64
          - make OS structure hardware-neutral
          - view state as replicated
            - naturally supports domains with no shared memory
            - naturally supports changes to running cores
    - Exokernel
      - implementation
      - problem
        - the monolithic kernel is general purpose and not optimized for applications
        - the monolithic kernel runs device drivers at the same privilege as the rest of the OS
          - device drivers are written by third parties and are buggy
          - a bug in a device driver brings down the entire kernel
      - Solution
        - applications manage physical resources. Separate policy from mechanism. The kernel only provides safety and the mechanism to safely manage resources
          - expose allocation
          - ensure protection
            - secure bindings
          - expose names
          - track ownership
            - resources
          - expose revocation
            - revoking access to resources
          - abort protocols
            - visible resource revocation
        - how does the application manage resources?
          - packet filters written by the application using primitives
    - Monolithic Kernel
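
The File System Interface nodes above (open(), write(), fsync(), lseek(), read(), unlink()) can be tied together in a short C sketch. It is only an illustration under stated assumptions: the helper names write_durably and read_at are hypothetical and do not come from the map or from any library, and error handling is kept minimal.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) `path`, write `msg`,
 * and force the data to disk with fsync() before closing.
 * Returns 0 on success, -1 on any failure. */
int write_durably(const char *path, const char *msg)
{
    /* O_CREAT requires a mode argument: owner read/write here. */
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = write(fd, msg, len);
    if (n < 0 || (size_t)n != len || fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: read up to cap - 1 bytes starting at byte
 * `offset` (non-sequential access via lseek()), NUL-terminate the
 * buffer, and return the number of bytes read or -1 on failure. */
ssize_t read_at(const char *path, off_t offset, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, buf, cap - 1);
    if (n >= 0)
        buf[n] = '\0';
    close(fd);
    return n;
}
```

A caller would typically run write_durably("foo", "some text"), then read_at("foo", 5, buf, sizeof buf) to fetch the tail of the file, and finally unlink("foo") to remove the directory entry.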

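
The "HOW TO BUILD A LOCK" and "Test And Set (Atomic Exchange)" nodes, together with pthread_create() and pthread_join(), can be sketched as a simple spin lock. This is a minimal illustration, not the textbook's code: it assumes C11 <stdatomic.h> is available, and the names tas_lock_t, tas_lock_acquire, tas_lock_release, and run_counters are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical spin lock built on an atomic test-and-set flag. */
typedef struct {
    atomic_flag held;
} tas_lock_t;

static void tas_lock_init(tas_lock_t *l)
{
    atomic_flag_clear(&l->held);
}

static void tas_lock_acquire(tas_lock_t *l)
{
    /* test-and-set returns the previous value: keep spinning
     * while another thread already holds the flag. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin */
}

static void tas_lock_release(tas_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct shared {
    tas_lock_t lock;
    long value;
    long iterations;
};

/* Each worker increments the shared counter under the lock. */
static void *worker(void *arg)
{
    struct shared *s = arg;
    for (long i = 0; i < s->iterations; i++) {
        tas_lock_acquire(&s->lock);
        s->value++;
        tas_lock_release(&s->lock);
    }
    return NULL;
}

/* Start `nthreads` workers, wait for all of them, and return the
 * final counter value (or -1 if the arguments or thread creation
 * fail). */
long run_counters(int nthreads, long iterations)
{
    enum { MAX_THREADS = 16 };
    if (nthreads < 1 || nthreads > MAX_THREADS || iterations < 0)
        return -1;

    struct shared s = { .value = 0, .iterations = iterations };
    tas_lock_init(&s.lock);

    pthread_t tids[MAX_THREADS];
    int started = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &s) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == nthreads ? s.value : -1;
}
```

With the lock in place the counter ends at nthreads times iterations; removing the acquire and release calls would typically lose updates. A spin lock like this is correct but wastes CPU while waiting, which is one of the criteria the lock discussion in the map points at.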