lock granularity : size of the code protected by a lock
coarse lock granularity : lock covers a large critical section
too-coarse lock granularity -> poor performance : low concurrency
fine lock granularity : lock covers a small critical section
too-fine lock granularity -> poor performance : high overhead
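sketch (Java) of coarse vs fine granularity, assuming two independent counters; the class and method names are made up for illustration:

```java
// Coarse: one lock guards both counters, so incA() and incB() exclude each other.
class CoarseCounters {
    private final Object lock = new Object();
    private long a, b;

    void incA() { synchronized (lock) { a++; } }
    void incB() { synchronized (lock) { b++; } }
    long sum()  { synchronized (lock) { return a + b; } }
}

// Fine: one lock per counter, so incA() and incB() can run at the same time,
// at the cost of more locks to manage.
class FineCounters {
    private final Object lockA = new Object();
    private final Object lockB = new Object();
    private long a, b;

    void incA() { synchronized (lockA) { a++; } }
    void incB() { synchronized (lockB) { b++; } }
    long sum() {
        // take both locks, always in the same order, for a consistent snapshot
        synchronized (lockA) { synchronized (lockB) { return a + b; } }
    }
}

class GranularityDemo {
    // Hypothetical driver: one thread bumps the "a" counters, another the "b" counters.
    static long[] run(int perThread) {
        CoarseCounters coarse = new CoarseCounters();
        FineCounters fine = new FineCounters();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) { coarse.incA(); fine.incA(); }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) { coarse.incB(); fine.incB(); }
        });
        t1.start();
        t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return new long[] { coarse.sum(), fine.sum() };
    }
}
```

both versions give the same totals; they differ only in how much the two threads block each other.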
distributed system: autonomous computers, physically dispersed, connected by a network,
appear as a single computer -> scalability (responsiveness), fault tolerance (reliability),
transparency (hide complexity)
drawback of naive approach
what: load imbalance between threads
why: array may not divide evenly; some tasks could be complex and others simple
meaning of SEQUENTIAL_CUTOFF
what: threshold for sequential(direct) computing
why: reduces the overhead of creating too many threads
problem if SEQUENTIAL_CUTOFF is too small
too many threads are created, with high overhead
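sketch (Java fork/join) of how SEQUENTIAL_CUTOFF is typically used in a divide-and-conquer array sum; the cutoff value 1000 and the class name SumTask are assumptions, and fork/join creates lightweight tasks rather than one thread per split:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    static final int SEQUENTIAL_CUTOFF = 1000; // assumed value; tune by measurement
    private final int[] arr;
    private final int lo, hi; // sums the range [lo, hi)

    SumTask(int[] arr, int lo, int hi) {
        this.arr = arr;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= SEQUENTIAL_CUTOFF) {
            // small enough: compute directly instead of splitting further
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += arr[i];
            return sum;
        }
        int mid = lo + (hi - lo) / 2;
        SumTask left = new SumTask(arr, lo, mid);
        SumTask right = new SumTask(arr, mid, hi);
        left.fork();                      // run the left half asynchronously
        long rightSum = right.compute();  // do the right half in this task
        long leftSum = left.join();       // wait for the left half
        return leftSum + rightSum;
    }

    static long sum(int[] arr) {
        return ForkJoinPool.commonPool().invoke(new SumTask(arr, 0, arr.length));
    }
}
```

with a cutoff of 1 every element becomes its own task, so task-creation overhead dominates the additions.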
producer-consumer problem
producer and consumer share the same bounded buffer
producer produces data and puts it into the buffer
consumer removes data from the buffer
producer goes to sleep if the buffer is full,
then consumer notifies producer when it removes data from the buffer
consumer goes to sleep if the buffer is empty,
then producer notifies consumer when it puts data into the buffer
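sketch (Java) of the protocol above using wait/notifyAll on a bounded buffer; BoundedBuffer and ProducerConsumerDemo are illustrative names, not from a library:

```java
import java.util.ArrayDeque;
import java.util.Queue;

class BoundedBuffer {
    private final Queue<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) { this.capacity = capacity; }

    synchronized void put(int x) throws InterruptedException {
        while (items.size() == capacity) wait(); // producer sleeps while full
        items.add(x);
        notifyAll();                             // wake a waiting consumer
    }

    synchronized int take() throws InterruptedException {
        while (items.isEmpty()) wait();          // consumer sleeps while empty
        int x = items.remove();
        notifyAll();                             // wake a waiting producer
        return x;
    }
}

class ProducerConsumerDemo {
    // Producer puts 1..n; consumer takes n items and returns their sum.
    static long run(int n, int capacity) {
        BoundedBuffer buf = new BoundedBuffer(capacity);
        long[] total = new long[1];
        Thread producer = new Thread(() -> {
            try { for (int i = 1; i <= n; i++) buf.put(i); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consumer = new Thread(() -> {
            try { for (int i = 0; i < n; i++) total[0] += buf.take(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.start();
        consumer.start();
        try { producer.join(); consumer.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return total[0];
    }
}
```

the wait() calls sit in while loops, not if statements, so a woken thread rechecks the buffer state before continuing.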
concurrency includes parallelism
thread safety can be obtained by avoiding race conditions
synchronization: used to obtain a correct runtime order and avoid race conditions
-> lock, semaphore, atomic variable,
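sketch (Java) of one of these tools, an atomic variable, making a shared counter thread safe; CounterDemo is an illustrative name:

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    // Several threads increment one shared counter; returns the final value.
    static int run(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                // incrementAndGet() is an atomic read-modify-write,
                // unlike a plain counter++ on an int field
                for (int i = 0; i < perThread; i++) counter.incrementAndGet();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return counter.get();
    }
}
```

with a plain int and counter++ the same test can lose updates, because two threads may read the same old value.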
volatile
reads and writes always go directly to/from main memory, not the thread's local cache
(ensures visibility, not atomicity)
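sketch (Java) of the typical use of volatile, a stop flag written by one thread and read by another; VolatileFlagDemo is an illustrative name:

```java
class VolatileFlagDemo {
    // volatile: the worker is guaranteed to see the write made by the main thread
    private static volatile boolean stop = false;

    // Returns true if the worker noticed the flag and terminated.
    static boolean run() {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until another thread sets the flag
            }
        });
        worker.start();
        stop = true; // visible to the worker because the field is volatile
        try { worker.join(2000); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return !worker.isAlive();
    }
}
```

without volatile the worker is allowed to keep reading a stale value and may spin forever; volatile still does not make compound updates such as count++ atomic.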
synchronization automatically prevents:
1) caching problem (incoherent cached data)
2) reordering problem (prevents compiler from reordering code)
avoiding deadlock
acquire locks in increasing order
release locks in decreasing order
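sketch (Java) of lock ordering, assuming each account carries a unique id that defines the order; Account and Transfers are hypothetical names:

```java
class Account {
    final int id;   // hypothetical unique id, defines the lock order
    long balance;

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }
}

class Transfers {
    static void transfer(Account from, Account to, long amount) {
        // always lock the smaller id first, whatever the transfer direction
        Account first = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        } // nested synchronized blocks release in the reverse order
    }

    // Two threads transfer in opposite directions; returns the final balances.
    static long[] run(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < rounds; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try { t1.join(); t2.join(); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return new long[] { a.balance, b.balance };
    }
}
```

if each thread instead locked "from" first and "to" second, t1 could hold a while t2 holds b, and both would wait forever.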
starvation: some threads get deferred (postponed) forever
concurrent hash map: hash table with full concurrency of retrievals and adjustable expected concurrency for updates
cache coherence: discipline which ensures that changes in shared data are propagated throughout the system
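sketch (Java) of concurrent updates on java.util.concurrent.ConcurrentHashMap; the key "hits" and the class name HitCountDemo are made up for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;

class HitCountDemo {
    // Several threads bump the same key; returns the final count.
    static int run(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // merge() performs the read-modify-write atomically per key
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try { w.join(); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return counts.get("hits");
    }
}
```

a separate get() followed by put() would not be atomic even on a concurrent map, so the single merge() call is used.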
decomposition: coverage
assignment: granularity
orchestration/mapping: locality
partitioning = decomposition + assignment
UMA and NUMA are shared memory architectures
- all processors can access all memory as a global address space
shared memory architecture
- all processors access memory as a global address space
- advantage: easy to program, fast data sharing
- disadvantage: lack of scalability, programmer is responsible for synchronization
distributed memory architecture
- each processor has only its own local memory and is independent (requires a network)
- advantage: scalability, cost effective
- disadvantage: programmer is responsible for communication, no global memory access
overhead <--> cpu utilization, load balancing