[Multicore] Mid-term

lock granularity : size of code protected by a lock
coarse lock granularity : lock covers a large critical section
too-coarse lock granularity -> poor performance : low concurrency
fine lock granularity : lock covers a small critical section
too-fine lock granularity -> poor performance : high locking overhead
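
A minimal Java sketch of the two granularities above. The classes CoarseCounters, FineCounters and GranularityDemo are hypothetical names made up for illustration: the coarse version guards both fields with one lock, the fine version uses one lock per field so unrelated updates do not block each other.

```java
// Hypothetical counters, used only to contrast the two granularities.
class CoarseCounters {
    private long hits, misses;
    // One lock (this) guards both fields: simple, but hit() and miss() block each other.
    synchronized void hit()   { hits++; }
    synchronized void miss()  { misses++; }
    synchronized long total() { return hits + misses; }
}

class FineCounters {
    private final Object hitLock = new Object();
    private final Object missLock = new Object();
    private long hits, misses;
    // One lock per field: more concurrency, but more locks to manage.
    void hit()  { synchronized (hitLock)  { hits++; } }
    void miss() { synchronized (missLock) { misses++; } }
    long total() {
        // Reading both fields needs both locks, always taken in the same order.
        synchronized (hitLock) { synchronized (missLock) { return hits + misses; } }
    }
}

class GranularityDemo {
    // Two threads update different fields n times each; returns the final total.
    static long runFine(int n) {
        FineCounters c = new FineCounters();
        Thread t1 = new Thread(() -> { for (int i = 0; i < n; i++) c.hit(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < n; i++) c.miss(); });
        t1.start(); t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return c.total();
    }

    public static void main(String[] args) {
        System.out.println(runFine(100_000)); // prints 200000
    }
}
```

Note that total() in the fine version already pays for two lock acquisitions, which is the kind of overhead that grows when granularity gets too fine.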

distributed system: autonomous computers, physically dispersed, connected by a network,
appear as a single computer -> scalability (responsiveness), fault tolerance (reliability),
transparency (hide complexity)

drawback of naive approach
what: load imbalance between threads
why: array may not divide evenly; some tasks may be complex while others are simple

meaning of SEQUENTIAL_CUTOFF
what: threshold for sequential(direct) computing
why: reduces the overhead of creating too many threads

what problem if SEQUENTIAL_CUTOFF too small
too many threads are created with high overhead
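
A sketch of where SEQUENTIAL_CUTOFF usually sits, assuming the notes refer to a divide-and-conquer array sum. It uses Java's fork/join framework (RecursiveTask, ForkJoinPool); the class names SumTask and CutoffDemo and the cutoff value 1000 are assumptions for illustration, not taken from the course material.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    static final int SEQUENTIAL_CUTOFF = 1_000; // assumed value; tune by measurement
    private final int[] arr;
    private final int lo, hi; // sums arr[lo..hi)

    SumTask(int[] arr, int lo, int hi) { this.arr = arr; this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        // Small range: compute directly instead of splitting further.
        if (hi - lo <= SEQUENTIAL_CUTOFF) {
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += arr[i];
            return sum;
        }
        // Large range: split in half, run the left half asynchronously.
        int mid = lo + (hi - lo) / 2;
        SumTask left = new SumTask(arr, lo, mid);
        SumTask right = new SumTask(arr, mid, hi);
        left.fork();
        long rightSum = right.compute(); // reuse the current thread for the right half
        return left.join() + rightSum;
    }

    static long sum(int[] arr) {
        return ForkJoinPool.commonPool().invoke(new SumTask(arr, 0, arr.length));
    }
}

class CutoffDemo {
    public static void main(String[] args) {
        int[] data = new int[10_000];
        Arrays.fill(data, 1);
        System.out.println(SumTask.sum(data)); // prints 10000
    }
}
```

With a cutoff of 1 the recursion would create roughly one task per element, so scheduling cost would dominate the actual additions.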

producer-consumer problem
producer and consumer share the same bounded buffer
producer produces data and puts it into buffer
consumer removes data from buffer
producer goes to sleep if buffer is full,
then consumer notifies producer when it removes data from buffer
consumer goes to sleep if buffer is empty,
then producer notifies consumer when it puts data into buffer
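
The sleep/notify protocol above can be sketched with Java's intrinsic monitor methods wait() and notifyAll(). BoundedBuffer and ProducerConsumerDemo are hypothetical names, and the capacity of 4 is an arbitrary assumption.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class BoundedBuffer<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) { this.capacity = capacity; }

    synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) wait(); // producer sleeps while buffer is full
        items.add(item);
        notifyAll();                             // wake a consumer waiting on "empty"
    }

    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) wait();          // consumer sleeps while buffer is empty
        T item = items.remove();
        notifyAll();                             // wake a producer waiting on "full"
        return item;
    }
}

class ProducerConsumerDemo {
    // Producer puts 1..n, consumer takes n items and sums them.
    static long run(int n) {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(4); // assumed capacity
        long[] sum = new long[1];
        Thread producer = new Thread(() -> {
            try { for (int i = 1; i <= n; i++) buffer.put(i); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consumer = new Thread(() -> {
            try { for (int i = 0; i < n; i++) sum[0] += buffer.take(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.start(); consumer.start();
        try { producer.join(); consumer.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(run(100)); // prints 5050
    }
}
```

The waits are in while loops rather than if statements because a woken thread must re-check the condition before proceeding.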

concurrency includes parallelism
thread safety can be obtained by avoiding race condition
synchronization in order to obtain correct runtime order and avoid race conditions
-> lock, semaphore, atomic variable,
volatile
reads/writes always go directly to/from main memory, not the thread's local cache
(ensures visibility not atomicity)
synchronization automatically prevents:
1) caching problem (incoherent cache data)
2) reordering problem (prevents compiler from reordering code)
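
A small sketch of "visibility, not atomicity": a volatile int is always read fresh, but ++ is still a read-modify-write that two threads can interleave, while an atomic variable makes the whole increment one step. CounterDemo is a hypothetical class written for this note.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    static volatile int volatileCount;                        // visible, but ++ is not atomic
    static final AtomicInteger atomicCount = new AtomicInteger();

    // Two threads each increment both counters n times; returns the atomic count.
    static int run(int n) {
        volatileCount = 0;
        atomicCount.set(0);
        Runnable work = () -> {
            for (int i = 0; i < n; i++) {
                volatileCount++;               // race: updates can be lost
                atomicCount.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return atomicCount.get();
    }

    public static void main(String[] args) {
        int atomic = run(100_000);
        System.out.println("atomic   = " + atomic);        // always 200000
        System.out.println("volatile = " + volatileCount); // may be less than 200000
    }
}
```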

avoiding deadlock
acquire locks in increasing order (same global order in every thread)
release locks in decreasing order (reverse of acquisition)
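
A sketch of ordered acquisition, assuming each lock can be ranked by a unique id. Account and LockOrderDemo are hypothetical names. Both threads lock the lower id first, so neither can hold one lock while waiting for a lock the other already holds; nested synchronized blocks release in the reverse order automatically.

```java
class Account {
    final int id;          // assumed unique; defines the global lock order
    private long balance;

    Account(int id, long balance) { this.id = id; this.balance = balance; }

    synchronized long balance() { return balance; }

    static void transfer(Account from, Account to, long amount) {
        // Always lock the account with the smaller id first.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            } // second released here
        }     // first released last
    }
}

class LockOrderDemo {
    // Two threads transfer in opposite directions; returns the combined balance.
    static long run(int n) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < n; i++) Account.transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < n; i++) Account.transfer(b, a, 1); });
        t1.start(); t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return a.balance() + b.balance();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 2000
    }
}
```

If transfer locked from before to regardless of id, the two threads above could each hold one account and wait forever for the other.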

starvation: some threads get deferred forever
concurrent hash map: hash table with full concurrency of retrievals and adjustable expected concurrency for updates
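
A short usage sketch with java.util.concurrent.ConcurrentHashMap. WordCountDemo and the key "lock" are made up for illustration. merge() performs the read-update-write for a key atomically, so no external lock is needed.

```java
import java.util.concurrent.ConcurrentHashMap;

class WordCountDemo {
    // Two threads each count the same key n times; returns the final count.
    static int run(int n) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Runnable work = () -> {
            for (int i = 0; i < n; i++) {
                counts.merge("lock", 1, Integer::sum); // atomic per-key update
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return counts.getOrDefault("lock", 0);
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 200000
    }
}
```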

cache coherence: discipline which ensures that changes in shared data are propagated throughout the system

decomposition: coverage
assignment: granularity
orchestration/mapping: locality

partitioning = decomposition + assignment

UMA and NUMA are Shared Memory Architecture
- all processors can access all memory as a global address space

shared memory architecture
- all processors access memory as a global address space
- advantage: easy to program, fast data sharing
- disadvantage: lack of scalability, programmer is responsible for synchronization

distributed memory architecture
- each processor has only its own local memory and runs independently (requires a network)
- advantage: scalability, cost effective
- disadvantage: programmer is responsible for communication, no global memory access

overhead <--> cpu utilization, load balancing
