README.CV -- Condition Variables
--------------------------------

The original implementation of condition variables in
pthreads-win32 was based on a discussion paper:

"Strategies for Implementing POSIX Condition Variables
on Win32": http://www.cs.wustl.edu/~schmidt/win32-cv-1.html

The changes suggested below were made on Feb 6 2001. This
file is included in the package for the benefit of anyone
interested in understanding the pthreads-win32 implementation
of condition variables and the (sometimes subtle) issues that
it attempts to resolve.

Thanks go to the individuals whose names appear throughout
the following text.

Ross Johnson

--------------------------------------------------------------------
fyi.. (more detailed problem description/demos + possible fix/patch)

regards,
alexander.

Alexander Terekhov
31.01.2001 17:43

To:      ace-bugs@cs.wustl.edu
cc:
From:    Alexander Terekhov/Germany/IBM@IBMDE
Subject: Implementation of POSIX CVs: spur.wakeups/lost
         signals/deadlocks/unfairness
ACE VERSION:
    5.1.12 (pthreads-win32 snapshot 2000-12-29)

HOST MACHINE and OPERATING SYSTEM:
    IBM IntelliStation Z Pro, 2 x XEON 1GHz, Win2K

TARGET MACHINE and OPERATING SYSTEM, if different from HOST:

COMPILER NAME AND VERSION (AND PATCHLEVEL):
    Microsoft Visual C++ 6.0

AREA/CLASS/EXAMPLE AFFECTED:
    Implementation of POSIX condition variables - OS.cpp/.h

DOES THE PROBLEM AFFECT:
    EXECUTION? YES!

SYNOPSIS:
    a) spurious wakeups (minor problem)
    b) lost signals
    c) broadcast deadlock
    d) unfairness (minor problem)
DESCRIPTION:
    Please see the attached copy of a discussion thread
    from comp.programming.threads for more details on
    some reported problems. (I also posted an "fyi"
    message to ace-users a week or two ago but
    unfortunately have not received any response so far.)

    It seems that the current implementation suffers from
    two essential problems:

    1) cond.waiters_count does not accurately reflect
       the number of waiters blocked on the semaphore -
       without proper synchronisation that can result
       (in the time window when the counter is inaccurate)
       in spurious wakeups caused by subsequent
       _signals and _broadcasts.

    2) Because the same queue (semaphore+counter) is always
       in use (with no copy_and_clear or similar), neither
       signal nor broadcast provides 'atomic' behaviour with
       respect to other threads/subsequent calls to
       signal/broadcast/wait.

    Each problem, and the combination of both, can produce
    various nasty things:
    a) spurious wakeups (minor problem)

       it is possible that a waiter which has already been
       unblocked is still counted as a blocked waiter.
       signal and broadcast will then release the
       semaphore, which will produce a spurious wakeup
       for a 'real' waiter coming later.

    b) lost signals

       the signalling thread ends up consuming its own
       signal. please see the demo/discussion below.

    c) broadcast deadlock

       the last_waiter processing code does not correctly
       handle the case of multiple threads
       waiting for the end of a broadcast.
       please see the demo/discussion below.

    d) unfairness (minor problem)

       without SignalObjectAndWait some waiter(s)
       may end up consuming broadcast signals
       multiple times (spurious wakeups) because the waiter
       thread(s) can be preempted before they call the
       semaphore wait (but after count++ and mtx.unlock).
REPEAT BY:
    See below... run the problem demo programs (tennis.cpp and
    tennisb.cpp) a number of times concurrently (on a multiprocessor)
    and in multiple sessions, or just add a couple of "Sleep"s
    as described in the attached copy of the discussion thread
    from comp.programming.threads.

SAMPLE FIX/WORKAROUND:
    See the attached patch to pthreads-win32.. well, I cannot
    claim that it is completely bug free, but at least my
    test and the tests provided by pthreads-win32 seem to work.
    Perhaps that will help.

regards,
alexander.
>> Forum: comp.programming.threads
>> Thread: pthread_cond_* implementation questions
.
.
.
David Schwartz <davids@webmaster.com> wrote:

> terekhov@my-deja.com wrote:
>
>> BTW, could you please also share your view on other perceived
>> "problems" such as nested broadcast deadlock, spurious wakeups
>> and (the latest one) lost signals??
>
> I'm not sure what you mean. The standard allows an implementation
> to do almost whatever it likes. In fact, you could implement
> pthread_cond_wait by releasing the mutex, sleeping a random
> amount of time, and then reacquiring the mutex. Of course,
> this would be a pretty poor implementation, but any code that
> didn't work under that implementation wouldn't be strictly
> compliant.

The implementation you suggested is indeed a correct
one (yes, now I see it :). However, it requires nothing
more from signal/broadcast than to "{ return 0; }".
That is not the case for the pthreads-win32 and ACE
implementations. I do think that these implementations
(basically the same implementation) have some serious
problems with the wait/signal/broadcast calls. I am looking
for help to clarify whether these problems are real
or not. I think that I can demonstrate what I mean
using one or two small sample programs.
.
.
.
==========
tennis.cpp
==========

#include "ace/Synch.h"
#include "ace/Thread.h"

enum GAME_STATE {
  START_GAME,
  PLAYER_A,     // Player A plays the ball
  PLAYER_B,     // Player B plays the ball
  GAME_OVER,
  ONE_PLAYER_GONE,
  BOTH_PLAYERS_GONE
};

enum GAME_STATE eGameState;
ACE_Mutex* pmtxGameStateLock;
ACE_Condition< ACE_Mutex >* pcndGameStateChange;

void*
playerA(
  void* pParm
  )
{
  // For access to game state variable
  pmtxGameStateLock->acquire();

  // Play loop
  while ( eGameState < GAME_OVER ) {
    // Play the ball
    cout << endl << "PLAYER-A" << endl;

    // Now it's PLAYER-B's turn
    eGameState = PLAYER_B;

    // Signal to PLAYER-B that now it is his turn
    pcndGameStateChange->signal();

    // Wait until PLAYER-B finishes playing the ball
    do {
      pcndGameStateChange->wait();
      if ( PLAYER_B == eGameState )
        cout << endl << "----PLAYER-A: SPURIOUS WAKEUP!!!" << endl;
    } while ( PLAYER_B == eGameState );
  }

  // PLAYER-A gone
  eGameState = (GAME_STATE)(eGameState+1);
  cout << endl << "PLAYER-A GONE" << endl;

  // No more access to state variable needed
  pmtxGameStateLock->release();

  // Signal PLAYER-A gone event
  pcndGameStateChange->broadcast();

  return 0;
}

void*
playerB(
  void* pParm
  )
{
  // For access to game state variable
  pmtxGameStateLock->acquire();

  // Play loop
  while ( eGameState < GAME_OVER ) {
    // Play the ball
    cout << endl << "PLAYER-B" << endl;

    // Now it's PLAYER-A's turn
    eGameState = PLAYER_A;

    // Signal to PLAYER-A that now it is his turn
    pcndGameStateChange->signal();

    // Wait until PLAYER-A finishes playing the ball
    do {
      pcndGameStateChange->wait();
      if ( PLAYER_A == eGameState )
        cout << endl << "----PLAYER-B: SPURIOUS WAKEUP!!!" << endl;
    } while ( PLAYER_A == eGameState );
  }

  // PLAYER-B gone
  eGameState = (GAME_STATE)(eGameState+1);
  cout << endl << "PLAYER-B GONE" << endl;

  // No more access to state variable needed
  pmtxGameStateLock->release();

  // Signal PLAYER-B gone event
  pcndGameStateChange->broadcast();

  return 0;
}

int
main (int, ACE_TCHAR *[])
{
  pmtxGameStateLock = new ACE_Mutex();
  pcndGameStateChange = new ACE_Condition< ACE_Mutex >( *pmtxGameStateLock );

  // Set initial state
  eGameState = START_GAME;

  // Create players
  ACE_Thread::spawn( playerA );
  ACE_Thread::spawn( playerB );

  // Give them 5 sec. to play
  Sleep( 5000 ); // sleep( 5 );

  // Set game over state
  pmtxGameStateLock->acquire();
  eGameState = GAME_OVER;

  // Let them know
  pcndGameStateChange->broadcast();

  // Wait for players to stop
  do {
    pcndGameStateChange->wait();
  } while ( eGameState < BOTH_PLAYERS_GONE );

  // Cleanup
  cout << endl << "GAME OVER" << endl;
  pmtxGameStateLock->release();
  delete pcndGameStateChange;
  delete pmtxGameStateLock;

  return 0;
}
===========
tennisb.cpp
===========

#include "ace/Synch.h"
#include "ace/Thread.h"

enum GAME_STATE {
  START_GAME,
  PLAYER_A,     // Player A plays the ball
  PLAYER_B,     // Player B plays the ball
  GAME_OVER,
  ONE_PLAYER_GONE,
  BOTH_PLAYERS_GONE
};

enum GAME_STATE eGameState;
ACE_Mutex* pmtxGameStateLock;
ACE_Condition< ACE_Mutex >* pcndGameStateChange;

void*
playerA(
  void* pParm
  )
{
  // For access to game state variable
  pmtxGameStateLock->acquire();

  // Play loop
  while ( eGameState < GAME_OVER ) {
    // Play the ball
    cout << endl << "PLAYER-A" << endl;

    // Now it's PLAYER-B's turn
    eGameState = PLAYER_B;

    // Signal to PLAYER-B that now it is his turn
    pcndGameStateChange->broadcast();

    // Wait until PLAYER-B finishes playing the ball
    do {
      pcndGameStateChange->wait();
      if ( PLAYER_B == eGameState )
        cout << endl << "----PLAYER-A: SPURIOUS WAKEUP!!!" << endl;
    } while ( PLAYER_B == eGameState );
  }

  // PLAYER-A gone
  eGameState = (GAME_STATE)(eGameState+1);
  cout << endl << "PLAYER-A GONE" << endl;

  // No more access to state variable needed
  pmtxGameStateLock->release();

  // Signal PLAYER-A gone event
  pcndGameStateChange->broadcast();

  return 0;
}

void*
playerB(
  void* pParm
  )
{
  // For access to game state variable
  pmtxGameStateLock->acquire();

  // Play loop
  while ( eGameState < GAME_OVER ) {
    // Play the ball
    cout << endl << "PLAYER-B" << endl;

    // Now it's PLAYER-A's turn
    eGameState = PLAYER_A;

    // Signal to PLAYER-A that now it is his turn
    pcndGameStateChange->broadcast();

    // Wait until PLAYER-A finishes playing the ball
    do {
      pcndGameStateChange->wait();
      if ( PLAYER_A == eGameState )
        cout << endl << "----PLAYER-B: SPURIOUS WAKEUP!!!" << endl;
    } while ( PLAYER_A == eGameState );
  }

  // PLAYER-B gone
  eGameState = (GAME_STATE)(eGameState+1);
  cout << endl << "PLAYER-B GONE" << endl;

  // No more access to state variable needed
  pmtxGameStateLock->release();

  // Signal PLAYER-B gone event
  pcndGameStateChange->broadcast();

  return 0;
}

int
main (int, ACE_TCHAR *[])
{
  pmtxGameStateLock = new ACE_Mutex();
  pcndGameStateChange = new ACE_Condition< ACE_Mutex >( *pmtxGameStateLock );

  // Set initial state
  eGameState = START_GAME;

  // Create players
  ACE_Thread::spawn( playerA );
  ACE_Thread::spawn( playerB );

  // Give them 5 sec. to play
  Sleep( 5000 ); // sleep( 5 );

  // Make some noise
  pmtxGameStateLock->acquire();
  cout << endl << "---Noise ON..." << endl;
  pmtxGameStateLock->release();
  for ( int i = 0; i < 100000; i++ )
    pcndGameStateChange->broadcast();
  cout << endl << "---Noise OFF" << endl;

  // Set game over state
  pmtxGameStateLock->acquire();
  eGameState = GAME_OVER;
  cout << endl << "---Stopping the game..." << endl;

  // Let them know
  pcndGameStateChange->broadcast();

  // Wait for players to stop
  do {
    pcndGameStateChange->wait();
  } while ( eGameState < BOTH_PLAYERS_GONE );

  // Cleanup
  cout << endl << "GAME OVER" << endl;
  pmtxGameStateLock->release();
  delete pcndGameStateChange;
  delete pmtxGameStateLock;

  return 0;
}
.
.
.
David Schwartz <davids@webmaster.com> wrote:

>> > It's compliant
>>
>> That is really good.
>
>> Tomorrow (I have to go urgently now) I will try to
>> demonstrate the lost-signal "problem" of the current
>> pthreads-win32 and ACE (variant w/o SignalObjectAndWait)
>> implementations: players suddenly start dropping their balls :-)
>> (with no change in source code).
>
> Signals aren't lost, they're going to the main thread,
> which isn't coded correctly to handle them. Try this:
>
>   // Wait for players to stop
>   do {
>     pthread_cond_wait( &cndGameStateChange, &mtxGameStateLock );
>     printf( "Main thread stole a signal\n" );
>   } while ( eGameState < BOTH_PLAYERS_GONE );
>
> I bet every time you think a signal is lost, you'll see that printf.
> The signal isn't lost, it was stolen by another thread.

well, you can probably lose your bet.. it was indeed stolen
by "another" thread, but not the one you seem to think of.
I think that what actually happens is the following:
H:\SA\UXX\pt\PTHREADS\TESTS>tennis3.exe

PLAYER-A
PLAYER-B
----PLAYER-B: SPURIOUS WAKEUP!!!
PLAYER-A GONE
PLAYER-B GONE
GAME OVER

H:\SA\UXX\pt\PTHREADS\TESTS>

here you can see that PLAYER-B, after playing his first
ball (which came via a signal from PLAYER-A), just dropped
it. What happened is that his signal to Player-A
was consumed as a spurious wakeup by himself (Player-B).

The implementation has a problem:

================
waiting threads:
================

{ /** Critical Section
  inc cond.waiters_count
}
/*
/* Atomic only if using Win32 SignalObjectAndWait
/*
cond.mtx.release
/*** ^^-- A THREAD WHICH DID SIGNAL MAY ACQUIRE THE MUTEX,
/***      GO INTO WAIT ON THE SAME CONDITION AND OVERTAKE
/***      ORIGINAL WAITER(S) CONSUMING ITS OWN SIGNAL!
cond.sem.wait

Player-A, after playing the game's initial ball, went into
wait (called _wait) but was pre-empted before reaching the
wait semaphore. He was counted as a waiter but was not
actually waiting/blocked yet.
===============
signal threads:
===============

{ /** Critical Section
  waiters_count = cond.waiters_count
}
if ( waiters_count != 0 )
  sem.post 1
endif

Player-B, after he received the signal/ball from Player-A,
called _signal. The _signal did see that there was
one waiter blocked on the condition (Player-A) and
released the semaphore.. (but it did not unblock
Player-A, because he was not actually blocked).
Player-B's thread continued its execution, called _wait,
was counted as a second waiter BUT was allowed to slip
through the opened semaphore gate (which was opened for
Player-A) and received his own signal. Player-B remained
blocked, followed by Player-A. A deadlock happened which
lasted until the main thread came in and said game over.
It seems to me that the implementation fails to
correctly implement the following statement
from the specification:

http://www.opengroup.org/
onlinepubs/007908799/xsh/pthread_cond_wait.html

"These functions atomically release mutex and cause
the calling thread to block on the condition variable
cond; atomically here means "atomically with respect
to access by another thread to the mutex and then the
condition variable". That is, if another thread is
able to acquire the mutex after the about-to-block
thread has released it, then a subsequent call to
pthread_cond_signal() or pthread_cond_broadcast()
in that thread behaves as if it were issued after
the about-to-block thread has blocked."

Question: Am I right?
(I produced the program output above by simply
adding 'Sleep( 1 )':

================
waiting threads:
================

{ /** Critical Section
  inc cond.waiters_count
}
/*
/* Atomic only if using Win32 SignalObjectAndWait
/*
cond.mtx.release

Sleep( 1 ); // Win32

/*** ^^-- A THREAD WHICH DID SIGNAL MAY ACQUIRE THE MUTEX,
/***      GO INTO WAIT ON THE SAME CONDITION AND OVERTAKE
/***      ORIGINAL WAITER(S) CONSUMING ITS OWN SIGNAL!
cond.sem.wait

to the source code of the pthreads-win32 implementation:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/
condvar.c?rev=1.36&content-type=text/
x-cvsweb-markup&cvsroot=pthreads-win32

/*
 * We keep the lock held just long enough to increment the count of
 * waiters by one (above).
 * Note that we can't keep it held across the
 * call to sem_wait since that will deadlock other calls
 * to pthread_cond_signal
 */
cleanup_args.mutexPtr = mutex;
cleanup_args.cv = cv;
cleanup_args.resultPtr = &result;

pthread_cleanup_push (ptw32_cond_wait_cleanup, (void *) &cleanup_args);

if ((result = pthread_mutex_unlock (mutex)) == 0)
  {
    Sleep( 1 ); // @AT

    /*
     * Wait to be awakened by
     *          pthread_cond_signal, or
     *          pthread_cond_broadcast, or
     *          a timeout
     *
     * Note:
     *      ptw32_sem_timedwait is a cancelation point,
     *      hence providing the
     *      mechanism for making pthread_cond_wait a cancelation
     *      point. We use the cleanup mechanism to ensure we
     *      re-lock the mutex and decrement the waiters count
     *      if we are canceled.
     */
    if (ptw32_sem_timedwait (&(cv->sema), abstime) == -1)
      {
        result = errno;
      }
  }

pthread_cleanup_pop (1);  /* Always cleanup */

BTW, on my system (2 CPUs) I manage to get
signals lost even without any source code modification
if I run the tennis program many times in different
shell sessions.
.
.
.
David Schwartz <davids@webmaster.com> wrote:

> terekhov@my-deja.com wrote:
>
>> well, it might be that the program is in fact buggy.
>> but you did not show me any bug.
>
> You're right. I was close but not dead on. I was correct, however,
> that the code is buggy because it uses 'pthread_cond_signal' even
> though not any thread waiting on the condition variable can do the
> job. I was wrong in which thread could be waiting on the cv but
> unable to do the job.

Okay, let's change 'pthread_cond_signal' to 'pthread_cond_broadcast',
but also add some noise from main() right before declaring the game
to be over (I need it in order to demonstrate another problem of the
pthreads-win32/ACE implementations - broadcast deadlock)...
.
.
.
It is my understanding of POSIX condition
variables that on a correct implementation the
added noise, in the form of unnecessary broadcasts
from main, should not break the tennis program. The
only 'side effect' of the added noise on a correct
implementation would be 'spurious wakeups' of
players (in fact they are not spurious;
the players just see them as spurious), unblocked
not by the other player but by main, before the
other player had a chance to acquire the
mutex and change the game state variable:
.
.
.
PLAYER-B
PLAYER-A
---Noise ON...
PLAYER-B
PLAYER-A
.
.
.
PLAYER-B
PLAYER-A
----PLAYER-A: SPURIOUS WAKEUP!!!
PLAYER-B
PLAYER-A
---Noise OFF
PLAYER-B
---Stopping the game...
PLAYER-A GONE
PLAYER-B GONE
GAME OVER

H:\SA\UXX\pt\PTHREADS\TESTS>
On the pthreads-win32/ACE implementations the
program could stall:
.
.
.
PLAYER-A
PLAYER-B
PLAYER-A
PLAYER-B
PLAYER-A
PLAYER-B
PLAYER-A
PLAYER-B
---Noise ON...
PLAYER-A
---Noise OFF
^C
H:\SA\UXX\pt\PTHREADS\TESTS>
The implementation has problems:

================
waiting threads:
================

{ /** Critical Section
  inc cond.waiters_count
}
/*
/* Atomic only if using Win32 SignalObjectAndWait
/*
cond.mtx.release
cond.sem.wait
/*** ^^-- WAITER CAN BE PREEMPTED AFTER BEING UNBLOCKED...
{ /** Critical Section
  dec cond.waiters_count
  /*** ^^-- ...AND BEFORE DECREMENTING THE COUNT (1)
  last_waiter = ( cond.was_broadcast &&
                  cond.waiters_count == 0 )
  if ( last_waiter )
    cond.was_broadcast = FALSE
  endif
}
if ( last_waiter )
  /*
  /* Atomic only if using Win32 SignalObjectAndWait
  /*
  cond.auto_reset_event_or_sem.post /* Event for Win32
  cond.mtx.acquire
  /*** ^^-- ...AND BEFORE CALL TO mtx.acquire (2)
  /*** ^^-- NESTED BROADCASTS RESULT IN A DEADLOCK
else
  cond.mtx.acquire
  /*** ^^-- ...AND BEFORE CALL TO mtx.acquire (3)
endif

==================
broadcast threads:
==================

{ /** Critical Section
  waiters_count = cond.waiters_count
  if ( waiters_count != 0 )
    cond.was_broadcast = TRUE
  endif
}
if ( waiters_count != 0 )
  cond.sem.post waiters_count
  /*** ^^^^^--- SPURIOUS WAKEUPS DUE TO (1)
  cond.auto_reset_event_or_sem.wait /* Event for Win32
  /*** ^^^^^--- DEADLOCK FOR FURTHER BROADCASTS IF THEY
                HAPPEN TO GO INTO WAIT WHILE PREVIOUS
                BROADCAST IS STILL IN PROGRESS/WAITING
endif

a) cond.waiters_count does not accurately reflect the
   number of waiters blocked on the semaphore - that can
   result (in the time window when the counter is inaccurate)
   in spurious wakeups caused by subsequent _signals
   and _broadcasts. From a standard-compliance point of view
   that is OK, but it could be a real problem from a
   performance/efficiency point of view.

b) If a subsequent broadcast happens to go into wait on
   cond.auto_reset_event_or_sem before the previous
   broadcast was unblocked from cond.auto_reset_event_or_sem
   by its last waiter, one of the two blocked threads will
   remain blocked, because the last_waiter processing code
   fails to unblock both threads.

In the situation with tennisb.cpp, Player-B was put
into a deadlock by the noise (broadcast) coming from the
main thread. And since Player-B holds the game state
mutex when he calls broadcast, the whole program
stalled: Player-A was deadlocked on the mutex, and the
main thread, after finishing producing the noise,
was deadlocked on the mutex too (needed to declare the
game over).
(I produced the program output above by simply
adding 'Sleep( 1 )':

==================
broadcast threads:
==================

{ /** Critical Section
  waiters_count = cond.waiters_count
  if ( waiters_count != 0 )
    cond.was_broadcast = TRUE
  endif
}
if ( waiters_count != 0 )

  Sleep( 1 ); // Win32

  cond.sem.post waiters_count
  /*** ^^^^^--- SPURIOUS WAKEUPS DUE TO (1)
  cond.auto_reset_event_or_sem.wait /* Event for Win32
  /*** ^^^^^--- DEADLOCK FOR FURTHER BROADCASTS IF THEY
                HAPPEN TO GO INTO WAIT WHILE PREVIOUS
                BROADCAST IS STILL IN PROGRESS/WAITING
endif

to the source code of the pthreads-win32 implementation:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/pthreads/
condvar.c?rev=1.36&content-type=text/
x-cvsweb-markup&cvsroot=pthreads-win32
if (wereWaiters)
  {
    /*
     * Wake up all waiters
     */

    Sleep( 1 ); // @AT

#ifdef NEED_SEM

    result = (ptw32_increase_semaphore( &cv->sema, cv->waiters )
               ? 0
               : EINVAL);

#else /* NEED_SEM */

    result = (ReleaseSemaphore( cv->sema, cv->waiters, NULL )
               ? 0
               : EINVAL);

#endif /* NEED_SEM */

  }

(void) pthread_mutex_unlock(&(cv->waitersLock));

if (wereWaiters && result == 0)
  {
    /*
     * Wait for all the awakened threads to acquire their part of
     * the counting semaphore
     */
    if (WaitForSingleObject (cv->waitersDone, INFINITE) == WAIT_OBJECT_0)
      {
        result = 0;
      }
    else
      {
        result = EINVAL;
      }
  }

return (result);
}

BTW, on my system (2 CPUs) I manage to get
the program stalled even without any source code
modification if I run the tennisb program many
times in different shell sessions.
===================
pthread-win32 patch
===================

struct pthread_cond_t_ {
  long            nWaitersBlocked;   /* Number of threads blocked            */
  long            nWaitersUnblocked; /* Number of threads unblocked          */
  long            nWaitersToUnblock; /* Number of threads to unblock         */
  sem_t           semBlockQueue;     /* Queue up threads waiting for the     */
                                     /*   condition to become signalled      */
  sem_t           semBlockLock;      /* Semaphore that guards access to      */
                                     /* | waiters blocked count/block queue  */
                                     /* +-> Mandatory Sync.LEVEL-1           */
  pthread_mutex_t mtxUnblockLock;    /* Mutex that guards access to          */
                                     /* | waiters (to)unblock(ed) counts     */
                                     /* +-> Optional* Sync.LEVEL-2           */
};                                   /* Opt*) for _timedwait and
                                        cancellation */
int
pthread_cond_init (pthread_cond_t * cond, const pthread_condattr_t * attr)
{
  int result = EAGAIN;
  pthread_cond_t cv = NULL;

  if (cond == NULL)
    {
      return EINVAL;
    }

  if ((attr != NULL && *attr != NULL) &&
      ((*attr)->pshared == PTHREAD_PROCESS_SHARED))
    {
      /*
       * Creating condition variable that can be shared between
       * processes.
       */
      result = ENOSYS;
      goto FAIL0;
    }

  cv = (pthread_cond_t) calloc (1, sizeof (*cv));

  if (cv == NULL)
    {
      result = ENOMEM;
      goto FAIL0;
    }

  cv->nWaitersBlocked   = 0;
  cv->nWaitersUnblocked = 0;
  cv->nWaitersToUnblock = 0;

  if (sem_init (&(cv->semBlockLock), 0, 1) != 0)
    {
      goto FAIL0;
    }

  if (sem_init (&(cv->semBlockQueue), 0, 0) != 0)
    {
      goto FAIL1;
    }

  if (pthread_mutex_init (&(cv->mtxUnblockLock), 0) != 0)
    {
      goto FAIL2;
    }

  result = 0;

  goto DONE;

  /*
   * -------------
   * Failed...
   * -------------
   */
FAIL2:
  (void) sem_destroy (&(cv->semBlockQueue));

FAIL1:
  (void) sem_destroy (&(cv->semBlockLock));

FAIL0:
DONE:
  *cond = cv;

  return (result);

} /* pthread_cond_init */
int
pthread_cond_destroy (pthread_cond_t * cond)
{
  int result = 0;
  pthread_cond_t cv;

  /*
   * Assuming any race condition here is harmless.
   */
  if (cond == NULL || *cond == NULL)
    {
      return EINVAL;
    }

  if (*cond != (pthread_cond_t) PTW32_OBJECT_AUTO_INIT)
    {
      cv = *cond;

      /*
       * Synchronize access to waiters blocked count (LEVEL-1)
       */
      if (sem_wait(&(cv->semBlockLock)) != 0)
        {
          return errno;
        }

      /*
       * Synchronize access to waiters (to)unblock(ed) counts (LEVEL-2)
       */
      if ((result = pthread_mutex_lock(&(cv->mtxUnblockLock))) != 0)
        {
          (void) sem_post(&(cv->semBlockLock));
          return result;
        }

      /*
       * Check whether cv is still busy (still has waiters blocked)
       */
      if (cv->nWaitersBlocked - cv->nWaitersUnblocked > 0)
        {
          (void) sem_post(&(cv->semBlockLock));
          (void) pthread_mutex_unlock(&(cv->mtxUnblockLock));
          return EBUSY;
        }

      /*
       * Now it is safe to destroy
       */
      (void) sem_destroy (&(cv->semBlockLock));
      (void) sem_destroy (&(cv->semBlockQueue));
      (void) pthread_mutex_unlock (&(cv->mtxUnblockLock));
      (void) pthread_mutex_destroy (&(cv->mtxUnblockLock));

      free(cv);
      *cond = NULL;
    }
  else
    {
      /*
       * See notes in ptw32_cond_check_need_init() above also.
       */
      EnterCriticalSection(&ptw32_cond_test_init_lock);

      /*
       * Check again.
       */
      if (*cond == (pthread_cond_t) PTW32_OBJECT_AUTO_INIT)
        {
          /*
           * This is all we need to do to destroy a statically
           * initialised cond that has not yet been used (initialised).
           * If we get to here, another thread
           * waiting to initialise this cond will get an EINVAL.
           */
          *cond = NULL;
        }
      else
        {
          /*
           * The cv has been initialised while we were waiting
           * so assume it's in use.
           */
          result = EBUSY;
        }

      LeaveCriticalSection(&ptw32_cond_test_init_lock);
    }

  return (result);
}
/*
 * Arguments for cond_wait_cleanup, since we can only pass a
 * single void * to it.
 */
typedef struct {
  pthread_mutex_t * mutexPtr;
  pthread_cond_t    cv;
  int             * resultPtr;
} ptw32_cond_wait_cleanup_args_t;

static void
ptw32_cond_wait_cleanup(void * args)
{
  ptw32_cond_wait_cleanup_args_t * cleanup_args =
    (ptw32_cond_wait_cleanup_args_t *) args;
  pthread_cond_t cv = cleanup_args->cv;
  int * resultPtr = cleanup_args->resultPtr;
  int eLastSignal; /* enum: 1=yes 0=no -1=cancelled/timedout w/o signal(s) */
  int result;

  /*
   * Whether we got here as a result of signal/broadcast or because of
   * timeout on wait or thread cancellation we indicate that we are no
   * longer waiting. The waiter is responsible for adjusting waiters
   * (to)unblock(ed) counts (protected by unblock lock).
   * Unblock lock/Sync.LEVEL-2 supports _timedwait and cancellation.
   */
  if ((result = pthread_mutex_lock(&(cv->mtxUnblockLock))) != 0)
    {
      *resultPtr = result;
      return;
    }

  cv->nWaitersUnblocked++;

  eLastSignal = (cv->nWaitersToUnblock == 0) ?
                  -1 : (--cv->nWaitersToUnblock == 0);

  /*
   * No more LEVEL-2 access to waiters (to)unblock(ed) counts needed
   */
  if ((result = pthread_mutex_unlock(&(cv->mtxUnblockLock))) != 0)
    {
      *resultPtr = result;
      return;
    }

  /*
   * If last signal...
   */
  if (eLastSignal == 1)
    {
      /*
       * ...it means that we have end of 'atomic' signal/broadcast
       */
      if (sem_post(&(cv->semBlockLock)) != 0)
        {
          *resultPtr = errno;
          return;
        }
    }
  /*
   * If not last signal and not timed out/cancelled wait w/o signal...
   */
  else if (eLastSignal == 0)
    {
      /*
       * ...it means that next waiter can go through semaphore
       */
      if (sem_post(&(cv->semBlockQueue)) != 0)
        {
          *resultPtr = errno;
          return;
        }
    }

  /*
   * XSH: Upon successful return, the mutex has been locked and is owned
   * by the calling thread
   */
  if ((result = pthread_mutex_lock(cleanup_args->mutexPtr)) != 0)
    {
      *resultPtr = result;
    }

} /* ptw32_cond_wait_cleanup */
static int | |
ptw32_cond_timedwait (pthread_cond_t * cond, | |
pthread_mutex_t * mutex, | |
const struct timespec *abstime) | |
{ | |
int result = 0; | |
pthread_cond_t cv; | |
ptw32_cond_wait_cleanup_args_t cleanup_args; | |
if (cond == NULL || *cond == NULL) | |
{(cond | |
return EINVAL; | |
} | |
/* | |
* We do a quick check to see if we need to do more work | |
* to initialise a static condition variable. We check | |
* again inside the guarded section of ptw32_cond_check_need_init() | |
* to avoid race conditions. | |
*/ | |
if (*cond == (pthread_cond_t) PTW32_OBJECT_AUTO_INIT) | |
{(*cond | |
result = ptw32_cond_check_need_init(cond); | |
} | |
if (result != 0 && result != EBUSY) | |
{(result | |
return result; | |
} | |
cv = *cond; | |
/* | |
* Synchronize access to waiters blocked count (LEVEL-1) | |
*/ | |
if (sem_wait(&(cv->semBlockLock)) != 0) | |
{(sem_wait(&(cv->semBlockLock)) | |
return errno; | |
} | |
cv->nWaitersBlocked++; | |
/* | |
* Thats it. Counted means waiting, no more access needed | |
*/ | |
if (sem_post(&(cv->semBlockLock)) != 0) | |
{(sem_post(&(cv->semBlockLock)) | |
return errno; | |
} | |
/* | |
* Setup this waiter cleanup handler | |
*/ | |
cleanup_args.mutexPtr = mutex; | |
cleanup_args.cv = cv; | |
cleanup_args.resultPtr = &result; | |
pthread_cleanup_push (ptw32_cond_wait_cleanup, (void *) &cleanup_args); | |
/* | |
* Now we can release 'mutex' and... | |
*/ | |
if ((result = pthread_mutex_unlock (mutex)) == 0) | |
{((result | |
/* | |
* ...wait to be awakened by | |
* pthread_cond_signal, or | |
* pthread_cond_broadcast, or | |
* timeout, or | |
* thread cancellation | |
* | |
* Note: | |
* | |
* ptw32_sem_timedwait is a cancellation point, | |
* hence providing the mechanism for making | |
* pthread_cond_wait a cancellation point. | |
* We use the cleanup mechanism to ensure we | |
* re-lock the mutex and adjust (to)unblock(ed) waiters | |
* counts if we are cancelled, timed out or signalled. | |
*/ | |
if (ptw32_sem_timedwait (&(cv->semBlockQueue), abstime) != 0) | |
{
result = errno; | |
} | |
} | |
/* | |
* Always cleanup | |
*/ | |
pthread_cleanup_pop (1); | |
/* | |
* "result" can be modified by the cleanup handler. | |
*/ | |
return (result); | |
} /* ptw32_cond_timedwait */ | |
static int | |
ptw32_cond_unblock (pthread_cond_t * cond, | |
int unblockAll) | |
{ | |
int result; | |
pthread_cond_t cv; | |
if (cond == NULL || *cond == NULL) | |
{
return EINVAL; | |
} | |
cv = *cond; | |
/* | |
* No-op if the CV is static and hasn't been initialised yet. | |
* Assuming that any race condition is harmless. | |
*/ | |
if (cv == (pthread_cond_t) PTW32_OBJECT_AUTO_INIT) | |
{
return 0; | |
} | |
/* | |
* Synchronize access to waiters blocked count (LEVEL-1) | |
*/ | |
if (sem_wait(&(cv->semBlockLock)) != 0) | |
{
return errno; | |
} | |
/* | |
* Synchronize access to waiters (to)unblock(ed) counts (LEVEL-2) | |
* This sync.level supports _timedwait and cancellation | |
*/ | |
if ((result = pthread_mutex_lock(&(cv->mtxUnblockLock))) != 0) | |
{
return result; | |
} | |
/* | |
* Adjust waiters blocked and unblocked counts (collect garbage) | |
*/ | |
if (cv->nWaitersUnblocked != 0) | |
{
cv->nWaitersBlocked -= cv->nWaitersUnblocked; | |
cv->nWaitersUnblocked = 0; | |
} | |
/* | |
* If (after adjustment) there are still some waiters blocked counted... | |
*/ | |
if ( cv->nWaitersBlocked > 0) | |
{
/* | |
* We will unblock the first waiter and leave semBlockLock/LEVEL-1 locked.
* LEVEL-1 access is left disabled until the last signal/unblock completes.
*/
cv->nWaitersToUnblock = (unblockAll) ? cv->nWaitersBlocked : 1; | |
/* | |
* No more LEVEL-2 access to waiters (to)unblock(ed) counts needed | |
* This sync.level supports _timedwait and cancellation | |
*/ | |
if ((result = pthread_mutex_unlock(&(cv->mtxUnblockLock))) != 0) | |
{
return result; | |
} | |
/*
* Now, with the LEVEL-2 lock released, let the first waiter go through the semaphore.
*/
if (sem_post(&(cv->semBlockQueue)) != 0) | |
{
return errno; | |
} | |
} | |
/* | |
* No waiter blocked - no more LEVEL-1 access to blocked count needed... | |
*/ | |
else if (sem_post(&(cv->semBlockLock)) != 0) | |
{ | |
return errno; | |
} | |
/*
* ...and no more LEVEL-2 access to waiters (to)unblock(ed) counts needed either.
* This sync.level supports _timedwait and cancellation.
*/
else | |
{ | |
result = pthread_mutex_unlock(&(cv->mtxUnblockLock)); | |
} | |
return(result); | |
} /* ptw32_cond_unblock */ | |
int | |
pthread_cond_wait (pthread_cond_t * cond, | |
pthread_mutex_t * mutex) | |
{ | |
/* The NULL abstime arg means INFINITE waiting. */ | |
return(ptw32_cond_timedwait(cond, mutex, NULL)); | |
} /* pthread_cond_wait */ | |
int | |
pthread_cond_timedwait (pthread_cond_t * cond, | |
pthread_mutex_t * mutex, | |
const struct timespec *abstime) | |
{ | |
if (abstime == NULL) | |
{
return EINVAL; | |
} | |
return(ptw32_cond_timedwait(cond, mutex, abstime)); | |
} /* pthread_cond_timedwait */ | |
int | |
pthread_cond_signal (pthread_cond_t * cond) | |
{ | |
/* The '0'(FALSE) unblockAll arg means unblock ONE waiter. */ | |
return(ptw32_cond_unblock(cond, 0)); | |
} /* pthread_cond_signal */ | |
int | |
pthread_cond_broadcast (pthread_cond_t * cond) | |
{ | |
/* The '1'(TRUE) unblockAll arg means unblock ALL waiters. */ | |
return(ptw32_cond_unblock(cond, 1)); | |
} /* pthread_cond_broadcast */ | |
TEREKHOV@de.ibm.com on 17.01.2001 01:00:57 | |
Please respond to TEREKHOV@de.ibm.com | |
To: pthreads-win32@sourceware.cygnus.com | |
cc: schmidt@uci.edu | |
Subject: win32 conditions: sem+counter+event = broadcast_deadlock + | |
spur.wakeup/unfairness/incorrectness ?? | |
Hi, | |
Problem 1: broadcast_deadlock | |
It seems that current implementation does not provide "atomic" | |
broadcasts. That may lead to "nested" broadcasts... and it seems | |
that nested case is not handled correctly -> producing a broadcast | |
DEADLOCK as a result. | |
Scenario: | |
N (>1) waiting threads W1..N are blocked (in _wait) on condition's | |
semaphore. | |
Thread B1 calls pthread_cond_broadcast, which results in "releasing" N | |
W threads via incrementing semaphore counter by N (stored in | |
cv->waiters) BUT cv->waiters counter does not change!! The caller | |
thread B1 remains blocked on cv->waitersDone event (auto-reset!!) BUT | |
condition is not protected from starting another broadcast (when called | |
on another thread) while still waiting for the "old" broadcast to | |
complete on thread B1. | |
M (>=0, <N) W threads are fast enough to go thru their _wait call and | |
decrement cv->waiters counter. | |
L (N-M) "late" waiter W threads are a) still blocked/not returned from | |
their semaphore wait call or b) were preempted after sem_wait but before | |
lock( &cv->waitersLock ) or c) are blocked on cv->waitersLock. | |
cv->waiters is still > 0 (= L). | |
Another thread B2 (or some W thread from M group) calls | |
pthread_cond_broadcast and gains access to counter... neither a) nor b) | |
prevent thread B2 in pthread_cond_broadcast from gaining access to | |
counter and starting another broadcast ( for c) - it depends on | |
cv->waitersLock scheduling rules: FIFO=OK, PRTY=PROBLEM,... ) | |
That call to pthread_cond_broadcast (on thread B2) will result in | |
incrementing semaphore by cv->waiters (=L) which is INCORRECT (all | |
W1..N were in fact already released by thread B1) and waiting on | |
_auto-reset_ event cv->waitersDone which is DEADLY WRONG (produces a | |
deadlock)... | |
All late W1..L threads now have a chance to complete their _wait call. | |
Last W_L thread sets an auto-reset event cv->waitersDone which will
release either B1 or B2 leaving one of B threads in a deadlock. | |
Problem 2: spur.wakeup/unfairness/incorrectness | |
It seems that: | |
a) because of the same problem with counter which does not reflect the | |
actual number of NOT RELEASED waiters, the signal call may increment | |
a semaphore counter w/o having a waiter blocked on it. That will result | |
in (best case) spurious wake ups - performance degradation due to | |
unnecessary context switches and predicate re-checks, and (in the worst case) an
unfairness/incorrectness problem - see b) | |
b) neither signal nor broadcast prevent other threads - "new waiters" | |
(and in the case of signal, the caller thread as well) from going into | |
_wait and overtaking "old" waiters (already released but still not returned | |
from sem_wait on condition's semaphore). Win semaphore just [API DOC]: | |
"Maintains a count between zero and some maximum value, limiting the number | |
of threads that are simultaneously accessing a shared resource." Calling | |
ReleaseSemaphore does not imply (at least not documented) that on return | |
from ReleaseSemaphore all waiters will in fact become released (returned | |
from their Wait... call) and/or that new waiters calling Wait... afterwards | |
will become less important. It is NOT documented to be an atomic release of
waiters... And even if it were, there is still a problem with a thread
being preempted after Wait on semaphore and before Wait on cv->waitersLock | |
and scheduling rules for cv->waitersLock itself | |
(??WaitForMultipleObjects??) | |
That may result in unfairness/incorrectness problem as described | |
for SetEvent impl. in "Strategies for Implementing POSIX Condition | |
Variables | |
on Win32": http://www.cs.wustl.edu/~schmidt/win32-cv-1.html | |
Unfairness -- The semantics of the POSIX pthread_cond_broadcast function is | |
to wake up all threads currently blocked in wait calls on the condition | |
variable. The awakened threads then compete for the external_mutex. To | |
ensure | |
fairness, all of these threads should be released from their | |
pthread_cond_wait calls and allowed to recheck their condition expressions | |
before other threads can successfully complete a wait on the condition | |
variable. | |
Unfortunately, the SetEvent implementation above does not guarantee that | |
all | |
threads sleeping on the condition variable when cond_broadcast is called | |
will | |
acquire the external_mutex and check their condition expressions. Although | |
the Pthreads specification does not mandate this degree of fairness, the | |
lack of fairness can cause starvation. | |
To illustrate the unfairness problem, imagine there are 2 threads, C1 and | |
C2, | |
that are blocked in pthread_cond_wait on condition variable not_empty_ that | |
is guarding a thread-safe message queue. Another thread, P1 then places two | |
messages onto the queue and calls pthread_cond_broadcast. If C1 returns | |
from | |
pthread_cond_wait, dequeues and processes the message, and immediately | |
waits | |
again then it and only it may end up acquiring both messages. Thus, C2 will | |
never get a chance to dequeue a message and run. | |
The following illustrates the sequence of events: | |
1. Thread C1 attempts to dequeue and waits on CV not_empty_
2. Thread C2 attempts to dequeue and waits on CV not_empty_
3. Thread P1 enqueues 2 messages and broadcasts to CV not_empty_ | |
4. Thread P1 exits | |
5. Thread C1 wakes up from CV not_empty_, dequeues a message and runs | |
6. Thread C1 waits again on CV not_empty_, immediately dequeues the 2nd | |
message and runs | |
7. Thread C1 exits | |
8. Thread C2 is the only thread left and blocks forever since | |
not_empty_ will never be signaled | |
Depending on the algorithm being implemented, this lack of fairness may | |
yield | |
concurrent programs that have subtle bugs. Of course, application | |
developers | |
should not rely on the fairness semantics of pthread_cond_broadcast. | |
However, | |
there are many cases where fair implementations of condition variables can | |
simplify application code. | |
Incorrectness -- A variation on the unfairness problem described above | |
occurs | |
when a third consumer thread, C3, is allowed to slip through even though it | |
was not waiting on condition variable not_empty_ when a broadcast occurred. | |
To illustrate this, we will use the same scenario as above: 2 threads, C1 | |
and | |
C2, are blocked dequeuing messages from the message queue. Another thread, | |
P1 | |
then places two messages onto the queue and calls pthread_cond_broadcast. | |
C1 | |
returns from pthread_cond_wait, dequeues and processes the message. At this | |
time, C3 acquires the external_mutex, calls pthread_cond_wait and waits on | |
the events in WaitForMultipleObjects. Since C2 has not had a chance to run | |
yet, the BROADCAST event is still signaled. C3 then returns from | |
WaitForMultipleObjects, and dequeues and processes the message in the | |
queue. | |
Thus, C2 will never get a chance to dequeue a message and run. | |
The following illustrates the sequence of events: | |
1. Thread C1 attempts to dequeue and waits on CV not_empty_
2. Thread C2 attempts to dequeue and waits on CV not_empty_
3. Thread P1 enqueues 2 messages and broadcasts to CV not_empty_ | |
4. Thread P1 exits | |
5. Thread C1 wakes up from CV not_empty_, dequeues a message and runs | |
6. Thread C1 exits | |
7. Thread C3 waits on CV not_empty_, immediately dequeues the 2nd | |
message and runs | |
8. Thread C3 exits | |
9. Thread C2 is the only thread left and blocks forever since | |
not_empty_ will never be signaled | |
In the above case, a thread that was not waiting on the condition variable | |
when a broadcast occurred was allowed to proceed. This leads to incorrect | |
semantics for a condition variable. | |
COMMENTS??? | |
regards, | |
alexander. | |
----------------------------------------------------------------------------- | |
Subject: RE: FYI/comp.programming.threads/Re: pthread_cond_* | |
implementation questions | |
Date: Wed, 21 Feb 2001 11:54:47 +0100 | |
From: TEREKHOV@de.ibm.com | |
To: lthomas@arbitrade.com | |
CC: rpj@ise.canberra.edu.au, Thomas Pfaff <tpfaff@gmx.net>, | |
Nanbor Wang <nanbor@cs.wustl.edu> | |
Hi Louis, | |
generation number 8.. | |
had some time to revisit timeouts/spurious wakeup problem.. | |
found some bugs (in 7.b/c/d) and something to improve | |
(7a - using IPC semaphores but it should speedup Win32 | |
version as well). | |
regards, | |
alexander. | |
---------- Algorithm 8a / IMPL_SEM,UNBLOCK_STRATEGY == UNBLOCK_ALL ------ | |
given: | |
semBlockLock - bin.semaphore | |
semBlockQueue - semaphore | |
mtxExternal - mutex or CS | |
mtxUnblockLock - mutex or CS | |
nWaitersGone - int | |
nWaitersBlocked - int | |
nWaitersToUnblock - int | |
wait( timeout ) { | |
[auto: register int result ] // error checking omitted | |
[auto: register int nSignalsWasLeft ] | |
[auto: register int nWaitersWasGone ] | |
sem_wait( semBlockLock ); | |
nWaitersBlocked++; | |
sem_post( semBlockLock ); | |
unlock( mtxExternal ); | |
bTimedOut = sem_wait( semBlockQueue,timeout ); | |
lock( mtxUnblockLock ); | |
if ( 0 != (nSignalsWasLeft = nWaitersToUnblock) ) { | |
if ( bTimedOut ) { // timeout (or canceled)
if ( 0 != nWaitersBlocked ) { | |
nWaitersBlocked--; | |
} | |
else { | |
nWaitersGone++; // count spurious wakeups | |
} | |
} | |
if ( 0 == --nWaitersToUnblock ) { | |
if ( 0 != nWaitersBlocked ) { | |
sem_post( semBlockLock ); // open the gate | |
nSignalsWasLeft = 0; // do not open the gate below again
} | |
else if ( 0 != (nWaitersWasGone = nWaitersGone) ) { | |
nWaitersGone = 0; | |
} | |
} | |
} | |
else if ( INT_MAX/2 == ++nWaitersGone ) { // timeout/canceled or spurious semaphore :-)
sem_wait( semBlockLock ); | |
nWaitersBlocked -= nWaitersGone; // something is going on here - test of timeouts? :-)
sem_post( semBlockLock ); | |
nWaitersGone = 0; | |
} | |
unlock( mtxUnblockLock ); | |
if ( 1 == nSignalsWasLeft ) { | |
if ( 0 != nWaitersWasGone ) { | |
// sem_adjust( -nWaitersWasGone ); | |
while ( nWaitersWasGone-- ) { | |
sem_wait( semBlockLock ); // better now than spurious later
} | |
} | |
sem_post( semBlockLock ); // open the gate | |
} | |
lock( mtxExternal ); | |
return ( bTimedOut ) ? ETIMEDOUT : 0;
} | |
signal(bAll) { | |
[auto: register int result ] | |
[auto: register int nSignalsToIssue] | |
lock( mtxUnblockLock ); | |
if ( 0 != nWaitersToUnblock ) { // the gate is closed!!! | |
if ( 0 == nWaitersBlocked ) { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
if (bAll) { | |
nWaitersToUnblock += nSignalsToIssue=nWaitersBlocked; | |
nWaitersBlocked = 0; | |
} | |
else { | |
nSignalsToIssue = 1; | |
nWaitersToUnblock++; | |
nWaitersBlocked--; | |
} | |
} | |
else if ( nWaitersBlocked > nWaitersGone ) { // HARMLESS RACE CONDITION! | |
sem_wait( semBlockLock ); // close the gate | |
if ( 0 != nWaitersGone ) { | |
nWaitersBlocked -= nWaitersGone; | |
nWaitersGone = 0; | |
} | |
if (bAll) { | |
nSignalsToIssue = nWaitersToUnblock = nWaitersBlocked; | |
nWaitersBlocked = 0; | |
} | |
else { | |
nSignalsToIssue = nWaitersToUnblock = 1; | |
nWaitersBlocked--; | |
} | |
} | |
else { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
unlock( mtxUnblockLock ); | |
sem_post( semBlockQueue,nSignalsToIssue ); | |
return result; | |
} | |
---------- Algorithm 8b / IMPL_SEM,UNBLOCK_STRATEGY == UNBLOCK_ONEBYONE | |
------ | |
given: | |
semBlockLock - bin.semaphore | |
semBlockQueue - bin.semaphore | |
mtxExternal - mutex or CS | |
mtxUnblockLock - mutex or CS | |
nWaitersGone - int | |
nWaitersBlocked - int | |
nWaitersToUnblock - int | |
wait( timeout ) { | |
[auto: register int result ] // error checking omitted | |
[auto: register int nWaitersWasGone ] | |
[auto: register int nSignalsWasLeft ] | |
sem_wait( semBlockLock ); | |
nWaitersBlocked++; | |
sem_post( semBlockLock ); | |
unlock( mtxExternal ); | |
bTimedOut = sem_wait( semBlockQueue,timeout ); | |
lock( mtxUnblockLock ); | |
if ( 0 != (nSignalsWasLeft = nWaitersToUnblock) ) { | |
if ( bTimedOut ) { // timeout (or canceled)
if ( 0 != nWaitersBlocked ) { | |
nWaitersBlocked--; | |
nSignalsWasLeft = 0; // do not unblock next waiter below (already unblocked)
} | |
else { | |
nWaitersGone = 1; // spurious wakeup pending!! | |
} | |
} | |
if ( 0 == --nWaitersToUnblock ) {
if ( 0 != nWaitersBlocked ) { | |
sem_post( semBlockLock ); // open the gate | |
nSignalsWasLeft = 0; // do not open the gate below again
} | |
else if ( 0 != (nWaitersWasGone = nWaitersGone) ) { | |
nWaitersGone = 0; | |
} | |
} | |
} | |
else if ( INT_MAX/2 == ++nWaitersGone ) { // timeout/canceled or spurious semaphore :-)
sem_wait( semBlockLock ); | |
nWaitersBlocked -= nWaitersGone; // something is going on here - test of timeouts? :-)
sem_post( semBlockLock ); | |
nWaitersGone = 0; | |
} | |
unlock( mtxUnblockLock ); | |
if ( 1 == nSignalsWasLeft ) { | |
if ( 0 != nWaitersWasGone ) { | |
// sem_adjust( -1 ); | |
sem_wait( semBlockQueue ); // better now than spurious later
} | |
sem_post( semBlockLock ); // open the gate | |
} | |
else if ( 0 != nSignalsWasLeft ) { | |
sem_post( semBlockQueue ); // unblock next waiter | |
} | |
lock( mtxExternal ); | |
return ( bTimedOut ) ? ETIMEDOUT : 0;
} | |
signal(bAll) { | |
[auto: register int result ] | |
lock( mtxUnblockLock ); | |
if ( 0 != nWaitersToUnblock ) { // the gate is closed!!! | |
if ( 0 == nWaitersBlocked ) { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
if (bAll) { | |
nWaitersToUnblock += nWaitersBlocked; | |
nWaitersBlocked = 0; | |
} | |
else { | |
nWaitersToUnblock++; | |
nWaitersBlocked--; | |
} | |
unlock( mtxUnblockLock ); | |
} | |
else if ( nWaitersBlocked > nWaitersGone ) { // HARMLESS RACE CONDITION! | |
sem_wait( semBlockLock ); // close the gate | |
if ( 0 != nWaitersGone ) { | |
nWaitersBlocked -= nWaitersGone; | |
nWaitersGone = 0; | |
} | |
if (bAll) { | |
nWaitersToUnblock = nWaitersBlocked; | |
nWaitersBlocked = 0; | |
} | |
else { | |
nWaitersToUnblock = 1; | |
nWaitersBlocked--; | |
} | |
unlock( mtxUnblockLock ); | |
sem_post( semBlockQueue ); | |
} | |
else { // NO-OP | |
unlock( mtxUnblockLock ); | |
} | |
return result; | |
} | |
---------- Algorithm 8c / IMPL_EVENT,UNBLOCK_STRATEGY == UNBLOCK_ONEBYONE | |
--------- | |
given: | |
hevBlockLock - auto-reset event | |
hevBlockQueue - auto-reset event | |
mtxExternal - mutex or CS | |
mtxUnblockLock - mutex or CS | |
nWaitersGone - int | |
nWaitersBlocked - int | |
nWaitersToUnblock - int | |
wait( timeout ) { | |
[auto: register int result ] // error checking omitted | |
[auto: register int nSignalsWasLeft ] | |
[auto: register int nWaitersWasGone ] | |
wait( hevBlockLock,INFINITE ); | |
nWaitersBlocked++; | |
set_event( hevBlockLock ); | |
unlock( mtxExternal ); | |
bTimedOut = wait( hevBlockQueue,timeout ); | |
lock( mtxUnblockLock ); | |
if ( 0 != (nSignalsWasLeft = nWaitersToUnblock) ) {
if ( bTimedOut ) { // timeout (or canceled)
if ( 0 != nWaitersBlocked ) { | |
nWaitersBlocked--; | |
nSignalsWasLeft = 0; // do not unblock next waiter below (already unblocked)
} | |
else { | |
nWaitersGone = 1; // spurious wakeup pending!! | |
} | |
} | |
if ( 0 == --nWaitersToUnblock ) {
if ( 0 != nWaitersBlocked ) { | |
set_event( hevBlockLock ); // open the gate | |
nSignalsWasLeft = 0; // do not open the gate below again
} | |
else if ( 0 != (nWaitersWasGone = nWaitersGone) ) { | |
nWaitersGone = 0; | |
} | |
} | |
} | |
else if ( INT_MAX/2 == ++nWaitersGone ) { // timeout/canceled or spurious event :-)
wait( hevBlockLock,INFINITE ); | |
nWaitersBlocked -= nWaitersGone; // something is going on here - test of timeouts? :-)
set_event( hevBlockLock ); | |
nWaitersGone = 0; | |
} | |
unlock( mtxUnblockLock ); | |
if ( 1 == nSignalsWasLeft ) { | |
if ( 0 != nWaitersWasGone ) { | |
reset_event( hevBlockQueue ); // better now than spurious later
} | |
set_event( hevBlockLock ); // open the gate | |
} | |
else if ( 0 != nSignalsWasLeft ) { | |
set_event( hevBlockQueue ); // unblock next waiter | |
} | |
lock( mtxExternal ); | |
return ( bTimedOut ) ? ETIMEDOUT : 0;
} | |
signal(bAll) { | |
[auto: register int result ] | |
lock( mtxUnblockLock ); | |
if ( 0 != nWaitersToUnblock ) { // the gate is closed!!! | |
if ( 0 == nWaitersBlocked ) { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
if (bAll) { | |
nWaitersToUnblock += nWaitersBlocked; | |
nWaitersBlocked = 0; | |
} | |
else { | |
nWaitersToUnblock++; | |
nWaitersBlocked--; | |
} | |
unlock( mtxUnblockLock ); | |
} | |
else if ( nWaitersBlocked > nWaitersGone ) { // HARMLESS RACE CONDITION! | |
wait( hevBlockLock,INFINITE ); // close the gate | |
if ( 0 != nWaitersGone ) { | |
nWaitersBlocked -= nWaitersGone; | |
nWaitersGone = 0; | |
} | |
if (bAll) { | |
nWaitersToUnblock = nWaitersBlocked; | |
nWaitersBlocked = 0; | |
} | |
else { | |
nWaitersToUnblock = 1; | |
nWaitersBlocked--; | |
} | |
unlock( mtxUnblockLock ); | |
set_event( hevBlockQueue ); | |
} | |
else { // NO-OP | |
unlock( mtxUnblockLock ); | |
} | |
return result; | |
} | |
---------- Algorithm 8d / IMPL_EVENT,UNBLOCK_STRATEGY == UNBLOCK_ALL ------ | |
given: | |
hevBlockLock - auto-reset event | |
hevBlockQueueS - auto-reset event // for signals | |
hevBlockQueueB - manual-reset event // for broadcasts
mtxExternal - mutex or CS | |
mtxUnblockLock - mutex or CS | |
eBroadcast - int // 0: no broadcast, 1: broadcast, 2: broadcast after signal(s)
nWaitersGone - int | |
nWaitersBlocked - int | |
nWaitersToUnblock - int | |
wait( timeout ) { | |
[auto: register int result ] // error checking omitted | |
[auto: register int eWasBroadcast ] | |
[auto: register int nSignalsWasLeft ] | |
[auto: register int nWaitersWasGone ] | |
wait( hevBlockLock,INFINITE ); | |
nWaitersBlocked++; | |
set_event( hevBlockLock ); | |
unlock( mtxExternal ); | |
bTimedOut = waitformultiple( hevBlockQueueS,hevBlockQueueB,timeout,ONE ); | |
lock( mtxUnblockLock ); | |
if ( 0 != (nSignalsWasLeft = nWaitersToUnblock) ) {
if ( bTimedOut ) { // timeout (or canceled)
if ( 0 != nWaitersBlocked ) { | |
nWaitersBlocked--; | |
nSignalsWasLeft = 0; // do not unblock next waiter below (already unblocked)
} | |
else if ( 1 != eBroadcast ) { | |
nWaitersGone = 1; | |
} | |
} | |
if ( 0 == --nWaitersToUnblock ) { | |
if ( 0 != nWaitersBlocked ) { | |
set_event( hevBlockLock ); // open the gate | |
nSignalsWasLeft = 0; // do not open the gate below again
} | |
else { | |
if ( 0 != (eWasBroadcast = eBroadcast) ) { | |
eBroadcast = 0; | |
} | |
if ( 0 != (nWaitersWasGone = nWaitersGone) ) {
nWaitersGone = 0; | |
} | |
} | |
} | |
else if ( 0 != eBroadcast ) { | |
nSignalsWasLeft = 0; // do not unblock next waiter below (already unblocked)
} | |
} | |
else if ( INT_MAX/2 == ++nWaitersGone ) { // timeout/canceled or spurious event :-)
wait( hevBlockLock,INFINITE ); | |
nWaitersBlocked -= nWaitersGone; // something is going on here - test of timeouts? :-)
set_event( hevBlockLock ); | |
nWaitersGone = 0; | |
} | |
unlock( mtxUnblockLock ); | |
if ( 1 == nSignalsWasLeft ) { | |
if ( 0 != eWasBroadcast ) { | |
reset_event( hevBlockQueueB ); | |
} | |
if ( 0 != nWaitersWasGone ) { | |
reset_event( hevBlockQueueS ); // better now than spurious later
} | |
set_event( hevBlockLock ); // open the gate | |
} | |
else if ( 0 != nSignalsWasLeft ) { | |
set_event( hevBlockQueueS ); // unblock next waiter | |
} | |
lock( mtxExternal ); | |
return ( bTimedOut ) ? ETIMEDOUT : 0;
} | |
signal(bAll) { | |
[auto: register int result ] | |
[auto: register HANDLE hevBlockQueue ] | |
lock( mtxUnblockLock ); | |
if ( 0 != nWaitersToUnblock ) { // the gate is closed!!! | |
if ( 0 == nWaitersBlocked ) { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
if (bAll) { | |
nWaitersToUnblock += nWaitersBlocked; | |
nWaitersBlocked = 0; | |
eBroadcast = 2; | |
hevBlockQueue = hevBlockQueueB; | |
} | |
else { | |
nWaitersToUnblock++; | |
nWaitersBlocked--; | |
return unlock( mtxUnblockLock ); | |
} | |
} | |
else if ( nWaitersBlocked > nWaitersGone ) { // HARMLESS RACE CONDITION! | |
wait( hevBlockLock,INFINITE ); // close the gate | |
if ( 0 != nWaitersGone ) { | |
nWaitersBlocked -= nWaitersGone; | |
nWaitersGone = 0; | |
} | |
if (bAll) { | |
nWaitersToUnblock = nWaitersBlocked; | |
nWaitersBlocked = 0; | |
eBroadcast = 1; | |
hevBlockQueue = hevBlockQueueB; | |
} | |
else { | |
nWaitersToUnblock = 1; | |
nWaitersBlocked--; | |
hevBlockQueue = hevBlockQueueS; | |
} | |
} | |
else { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
unlock( mtxUnblockLock ); | |
set_event( hevBlockQueue ); | |
return result; | |
} | |
---------------------- Forwarded by Alexander Terekhov/Germany/IBM on | |
02/21/2001 09:13 AM --------------------------- | |
Alexander Terekhov | |
02/20/2001 04:33 PM | |
To: Louis Thomas <lthomas@arbitrade.com> | |
cc: | |
From: Alexander Terekhov/Germany/IBM@IBMDE | |
Subject: RE: FYI/comp.programming.threads/Re: pthread_cond_* implementation questions
Importance: Normal | |
>Sorry, gotta take a break and work on something else for a while. | |
>Real work | |
>calls, unfortunately. I'll get back to you in two or three days. | |
ok. no problem. here is some more stuff for pauses you might have | |
in between :) | |
---------- Algorithm 7d / IMPL_EVENT,UNBLOCK_STRATEGY == UNBLOCK_ALL ------ | |
given: | |
hevBlockLock - auto-reset event | |
hevBlockQueueS - auto-reset event // for signals | |
hevBlockQueueB - manual-reset event // for broadcasts
mtxExternal - mutex or CS | |
mtxUnblockLock - mutex or CS | |
bBroadcast - int | |
nWaitersGone - int | |
nWaitersBlocked - int | |
nWaitersToUnblock - int | |
wait( timeout ) { | |
[auto: register int result ] // error checking omitted | |
[auto: register int bWasBroadcast ] | |
[auto: register int nSignalsWasLeft ] | |
wait( hevBlockLock,INFINITE ); | |
nWaitersBlocked++; | |
set_event( hevBlockLock ); | |
unlock( mtxExternal ); | |
bTimedOut = waitformultiple( hevBlockQueueS,hevBlockQueueB,timeout,ONE ); | |
lock( mtxUnblockLock ); | |
if ( 0 != (nSignalsWasLeft = nWaitersToUnblock) ) {
if ( bTimedOut ) { // timeout (or canceled)
if ( 0 != nWaitersBlocked ) { | |
nWaitersBlocked--; | |
nSignalsWasLeft = 0; // do not unblock next waiter below (already unblocked)
} | |
else if ( !bBroadcast ) { | |
wait( hevBlockQueueS,INFINITE ); // better now than spurious later
} | |
} | |
if ( 0 == --nWaitersToUnblock ) { | |
if ( 0 != nWaitersBlocked ) { | |
if ( bBroadcast ) { | |
reset_event( hevBlockQueueB ); | |
bBroadcast = false; | |
} | |
set_event( hevBlockLock ); // open the gate | |
nSignalsWasLeft = 0; // do not open the gate below again
} | |
else if ( false != (bWasBroadcast = bBroadcast) ) { | |
bBroadcast = false; | |
} | |
} | |
else { | |
bWasBroadcast = bBroadcast; | |
} | |
} | |
else if ( INT_MAX/2 == ++nWaitersGone ) { // timeout/canceled or spurious event :-)
wait( hevBlockLock,INFINITE ); | |
nWaitersBlocked -= nWaitersGone; // something is going on here - test of timeouts? :-)
set_event( hevBlockLock ); | |
nWaitersGone = 0; | |
} | |
unlock( mtxUnblockLock ); | |
if ( 1 == nSignalsWasLeft ) { | |
if ( bWasBroadcast ) { | |
reset_event( hevBlockQueueB ); | |
} | |
set_event( hevBlockLock ); // open the gate | |
} | |
else if ( 0 != nSignalsWasLeft && !bWasBroadcast ) { | |
set_event( hevBlockQueueS ); // unblock next waiter | |
} | |
lock( mtxExternal ); | |
return ( bTimedOut ) ? ETIMEDOUT : 0;
} | |
signal(bAll) { | |
[auto: register int result ] | |
[auto: register HANDLE hevBlockQueue ] | |
lock( mtxUnblockLock ); | |
if ( 0 != nWaitersToUnblock ) { // the gate is closed!!! | |
if ( 0 == nWaitersBlocked ) { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
if (bAll) { | |
nWaitersToUnblock += nWaitersBlocked; | |
nWaitersBlocked = 0; | |
bBroadcast = true; | |
hevBlockQueue = hevBlockQueueB; | |
} | |
else { | |
nWaitersToUnblock++; | |
nWaitersBlocked--; | |
return unlock( mtxUnblockLock ); | |
} | |
} | |
else if ( nWaitersBlocked > nWaitersGone ) { // HARMLESS RACE CONDITION! | |
wait( hevBlockLock,INFINITE ); // close the gate | |
if ( 0 != nWaitersGone ) { | |
nWaitersBlocked -= nWaitersGone; | |
nWaitersGone = 0; | |
} | |
if (bAll) { | |
nWaitersToUnblock = nWaitersBlocked; | |
nWaitersBlocked = 0; | |
bBroadcast = true; | |
hevBlockQueue = hevBlockQueueB; | |
} | |
else { | |
nWaitersToUnblock = 1; | |
nWaitersBlocked--; | |
hevBlockQueue = hevBlockQueueS; | |
} | |
} | |
else { // NO-OP | |
return unlock( mtxUnblockLock ); | |
} | |
unlock( mtxUnblockLock ); | |
set_event( hevBlockQueue ); | |
return result; | |
} | |
---------------------------------------------------------------------------- | |
Subject: RE: FYI/comp.programming.threads/Re: pthread_cond_* implementation questions
Date: Mon, 26 Feb 2001 22:20:12 -0600 | |
From: Louis Thomas <lthomas@arbitrade.com> | |
To: "'TEREKHOV@de.ibm.com'" <TEREKHOV@de.ibm.com> | |
CC: rpj@ise.canberra.edu.au, Thomas Pfaff <tpfaff@gmx.net>, | |
Nanbor Wang | |
<nanbor@cs.wustl.edu> | |
Sorry all. Busy week. | |
> this insures the fairness | |
> which POSIX does not (e.g. two subsequent broadcasts - the gate does | |
insure | |
> that first wave waiters will start the race for the mutex before waiters | |
> from the second wave - Linux pthreads process/unblock both waves | |
> concurrently...) | |
I'm not sure how we are any more fair about this than Linux. We certainly | |
don't guarantee that the threads released by the first broadcast will get | |
the external mutex before the threads of the second wave. In fact, it is | |
possible that those threads will never get the external mutex if there is | |
enough contention for it. | |
> e.g. i was thinking about implementation with a pool of | |
> N semaphores/counters [...] | |
I considered that too. The problem is as you mentioned in a). You really | |
need to assign threads to semaphores once you know how you want to wake them | |
up, not when they first begin waiting which is the only time you can assign | |
them. | |
> well, i am not quite sure that i've fully understood your scenario, | |
Hmm. Well, I think it's an important example, so I'll try again. First, we
have thread A which we KNOW is waiting on a condition. As soon as it becomes | |
unblocked for any reason, we will know because it will set a flag. Since the | |
flag is not set, we are 100% confident that thread A is waiting on the | |
condition. We have another thread, thread B, which has acquired the mutex | |
and is about to wait on the condition. Thus it is pretty clear that at any | |
point, either just A is waiting, or A and B are waiting. Now thread C comes | |
along. C is about to do a broadcast on the condition. A broadcast is | |
guaranteed to unblock all threads currently waiting on a condition, right? | |
Again, we said that either just A is waiting, or A and B are both waiting. | |
So, when C does its broadcast, depending upon whether B has started waiting | |
or not, thread C will unblock A or unblock A and B. Either way, C must | |
unblock A, right? | |
Now, you said anything that happens is correct so long as a) "a signal is | |
not lost between unlocking the mutex and waiting on the condition" and b) "a | |
thread must not steal a signal it sent", correct? Requirement b) is easy to | |
satisfy: in this scenario, thread C will never wait on the condition, so it | |
won't steal any signals. Requirement a) is not hard either. The only way we | |
could fail to meet requirement a) in this scenario is if thread B had
started waiting but didn't wake up because a signal was lost. This will not | |
happen. | |
Now, here is what happens. Assume thread C beats thread B. Thread C looks to | |
see how many threads are waiting on the condition. Thread C sees just one | |
thread, thread A, waiting. It does a broadcast waking up just one thread | |
because just one thread is waiting. Next, before A can become unblocked, | |
thread B begins waiting. Now there are two threads waiting, but only one | |
will be unblocked. Suppose B wins. B will become unblocked. A will not | |
become unblocked, because C only unblocked one thread (sema_post cond, 1). | |
So at the end, B finishes and A remains blocked. | |
We have met both of your requirements, so by your rules, this is an | |
acceptable outcome. However, I think that the spec says this is an | |
unacceptable outcome! We know for certain that A was waiting and that C did | |
a broadcast, but A did not become unblocked! Yet, the spec says that a | |
broadcast wakes up all waiting threads. This did not happen. Do you agree | |
that this shows your rules are not strict enough? | |
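The interleaving above can be replayed deterministically. The sketch below is not real threading code; it is a single-threaded model (all struct and function names are illustrative, not from any real implementation) of the naive "count the waiters, post that many" broadcast, applying the steps in the order Louis describes:

```c
#include <assert.h>

/* Single-threaded model of the naive broadcast algorithm. */
typedef struct {
    int nWaiters;  /* threads currently registered as waiting       */
    int semCount;  /* wakeup tokens available on the cond semaphore */
} cv_model;

static void begin_wait(cv_model *cv) { cv->nWaiters++; }

/* broadcast posts one token per waiter registered *right now* */
static void broadcast(cv_model *cv) { cv->semCount += cv->nWaiters; }

/* a waiter consumes a token if one is available; returns 1 if it woke */
static int try_wake(cv_model *cv) {
    if (cv->semCount > 0) { cv->semCount--; cv->nWaiters--; return 1; }
    return 0;
}

/* Replays the interleaving: A waits, C broadcasts (seeing one waiter),
 * B begins waiting, B consumes the token. Returns 1 if A got woken. */
static int replay_scenario(void) {
    cv_model cv = { 0, 0 };
    begin_wait(&cv);       /* A waits; nWaiters == 1          */
    broadcast(&cv);        /* C posts exactly 1 token         */
    begin_wait(&cv);       /* B begins waiting before A wakes */
    (void)try_wake(&cv);   /* B steals the single token       */
    return try_wake(&cv);  /* does A ever wake? */
}
```

Running the replay shows A never wakes, which is exactly the lost broadcast the scenario argues is unacceptable.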
> and what about N2? :) this one does allow almost everything. | |
Don't get me started about rule #2. I'll NEVER advocate an algorithm that | |
uses rule 2 as an excuse to suck! | |
> but it is done (decrement) under mutex protection - this is not a subject
> of a race condition. | |
You are correct. My mistake. | |
> i would remove "_bTimedOut=false".. after all, it was a real timeout.. | |
I disagree. A thread that can't successfully retract its waiter status can't
really have timed out. If a thread can't return without executing extra code
to deal with the fact that someone tried to unblock it, I think it is a poor
idea to pretend we didn't realize someone was trying to signal us. After
all, a signal is more important than a time out.
> when nSignaled != 0, it is possible to update nWaiters (--) and do not | |
> touch nGone | |
I realize this, but I was thinking that writing it the other way saves
another if statement. | |
> adjust only if nGone != 0 and save one cache memory write - probably much
> slower than 'if'
Hmm. You are probably right. | |
> well, in a strange (e.g. timeout test) program you may (theoretically)
> have an overflow of nWaiters/nGone counters (with waiters repeatedly
> timing out and no signals at all).
Also true. Not only that, but you also have the possibility that one could | |
overflow the number of waiters as well! However, considering the limit you | |
have chosen for nWaitersGone, I suppose it is unlikely that anyone would be | |
able to get INT_MAX/2 threads waiting on a single condition. :) | |
Analysis of 8a: | |
It looks correct to me. | |
What are IPC semaphores? | |
In the line where you state, "else if ( nWaitersBlocked > nWaitersGone ) { | |
// HARMLESS RACE CONDITION!" there is no race condition for nWaitersGone | |
because nWaitersGone is never modified without holding mtxUnblockLock. You | |
are correct that there is a harmless race on nWaitersBlocked, which can | |
increase and make the condition become true just after we check it. If this | |
happens, we interpret it as the wait starting after the signal. | |
I like your optimization of this. You could improve Alg. 6 as follows: | |
---------- Algorithm 6b ----------
signal(bAll) {
  _nSig=0
  lock counters
  // this is safe because nWaiting can only be decremented by a thread that
  // owns counters and nGone can only be changed by a thread that owns
  // counters.
  if (nWaiting>nGone) {
    if (0==nSignaled) {
      sema_wait gate // close gate if not already closed
    }
    if (nGone>0) {
      nWaiting-=nGone
      nGone=0
    }
    _nSig=bAll?nWaiting:1
    nSignaled+=_nSig
    nWaiting-=_nSig
  }
  unlock counters
  if (0!=_nSig) {
    sema_post queue, _nSig
  }
}
---------- ---------- ----------
I guess this wouldn't apply to Alg 8a because nWaitersGone changes meanings | |
depending upon whether the gate is open or closed. | |
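Algorithm 6b's signal path can be turned into a compilable sketch by modelling the gate and queue semaphores as plain counters (a hypothetical single-threaded model, so the lock/unlock of the counters mutex is elided):

```c
typedef struct {
    int nWaiting;   /* waiters not yet signalled        */
    int nGone;      /* waiters that timed out           */
    int nSignaled;  /* signals issued, not yet consumed */
    int gate;       /* 1 = gate open, 0 = closed        */
    int queue;      /* tokens posted to blocked waiters */
} cv6b;

/* Direct transcription of Algorithm 6b's signal(bAll). */
void signal6b(cv6b *cv, int bAll) {
    int nSig = 0;
    /* lock counters */
    if (cv->nWaiting > cv->nGone) {
        if (cv->nSignaled == 0)
            cv->gate = 0;               /* sema_wait gate: close it  */
        if (cv->nGone > 0) {
            cv->nWaiting -= cv->nGone;  /* drop timed-out waiters    */
            cv->nGone = 0;
        }
        nSig = bAll ? cv->nWaiting : 1;
        cv->nSignaled += nSig;
        cv->nWaiting  -= nSig;
    }
    /* unlock counters */
    if (nSig != 0)
        cv->queue += nSig;              /* sema_post queue, nSig     */
}
```

For example, a broadcast with three waiters of which one has timed out (nWaiting=3, nGone=1) closes the gate and posts exactly two tokens.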
In the loop "while ( nWaitersWasGone-- ) {" you do a sema_wait on | |
semBlockLock. Perhaps waiting on semBlockQueue would be a better idea. | |
What have you gained by making the last thread to be signaled do the waits | |
for all the timed out threads, besides added complexity? It took me a long | |
time to figure out what your objective was with this, to realize you were | |
using nWaitersGone to mean two different things, and to verify that you | |
hadn't introduced any bug by doing this. Even now I'm not 100% sure. | |
What has all this playing about with nWaitersGone really gained us besides a | |
lot of complexity (it is much harder to verify that this solution is | |
correct), execution overhead (we now have a lot more if statements to | |
evaluate), and space overhead (more space for the extra code, and another | |
integer in our data)? We did manage to save a lock/unlock pair in an | |
uncommon case (when a time out occurs) at the above mentioned expenses in | |
the common cases. | |
As for 8b, c, and d, they look ok though I haven't studied them thoroughly. | |
What would you use them for? | |
Later, | |
-Louis! :) | |
----------------------------------------------------------------------------- | |
Subject: RE: FYI/comp.programming.threads/Re: pthread_cond_* implementation questions
Date: Tue, 27 Feb 2001 15:51:28 +0100 | |
From: TEREKHOV@de.ibm.com | |
To: Louis Thomas <lthomas@arbitrade.com> | |
CC: rpj@ise.canberra.edu.au, Thomas Pfaff <tpfaff@gmx.net>, | |
Nanbor Wang <nanbor@cs.wustl.edu> | |
Hi Louis, | |
>> that first wave waiters will start the race for the mutex before waiters | |
>> from the second wave - Linux pthreads process/unblock both waves | |
>> concurrently...) | |
> | |
>I'm not sure how we are any more fair about this than Linux. We certainly | |
>don't guarantee that the threads released by the first broadcast will get | |
>the external mutex before the threads of the second wave. In fact, it is | |
>possible that those threads will never get the external mutex if there is | |
>enough contention for it. | |
correct. but the gate is nevertheless more fair than Linux because of the
barrier it establishes between two races (1st and 2nd wave waiters) for | |
the mutex which under 'normal' circumstances (e.g. all threads of equal | |
priorities,..) will 'probably' result in fair behaviour with respect to | |
mutex ownership. | |
>> well, i am not quite sure that i've fully understood your scenario, | |
> | |
>Hmm. Well, it think it's an important example, so I'll try again. ... | |
ok. now i seem to understand this example. well, it seems to me
that the only meaningful rule is just: | |
a) "a signal is not lost between unlocking the mutex and waiting on the | |
condition" | |
and that the rule | |
b) "a thread must not steal a signal it sent" | |
is not needed at all because a thread which violates b) also violates a). | |
i'll try to explain.. | |
i think that the most important thing is how POSIX defines waiter's | |
visibility: | |
"if another thread is able to acquire the mutex after the about-to-block
thread has released it, then a subsequent call to pthread_cond_signal() or
pthread_cond_broadcast() in that thread behaves as if it were issued after
the about-to-block thread has blocked."
my understanding is the following: | |
1) there are no guarantees whatsoever with respect to whether
signal/broadcast will actually unblock any 'waiter' if it is done w/o
acquiring the mutex first (note that a thread may release it before
signal/broadcast - it does not matter).
2) it is guaranteed that waiters become 'visible' - eligible for unblock -
as soon as the signalling thread acquires the mutex (but not before!!)
so.. | |
>So, when C does its broadcast, depending upon whether B has started
>waiting or not, thread C will unblock A or unblock A and B. Either way, C
>must unblock A, right?
right. but only if C did acquire the mutex prior to broadcast (it may
release it before broadcast as well). the implementation will violate the
waiters visibility rule (a signal will become lost) if C does not
unblock A.
>Now, here is what happens. Assume thread C beats thread B. Thread C looks | |
>to
>see how many threads are waiting on the condition. Thread C sees just one | |
>thread, thread A, waiting. It does a broadcast waking up just one thread | |
>because just one thread is waiting. Next, before A can become unblocked, | |
>thread B begins waiting. Now there are two threads waiting, but only one | |
>will be unblocked. Suppose B wins. B will become unblocked. A will not | |
>become unblocked, because C only unblocked one thread (sema_post cond, 1). | |
>So at the end, B finishes and A remains blocked. | |
thread C did acquire the mutex ("Thread C sees just one thread, thread A, | |
waiting"). beginning from that moment it is guaranteed that subsequent | |
broadcast will unblock A. Otherwise we will have a lost signal with respect | |
to A. I do think that it does not matter whether the signal was physically | |
(completely) lost or was just stolen by another thread (B) - in both cases | |
it was simply lost with respect to A. | |
>..Do you agree that this shows your rules are not strict enough? | |
probably the opposite.. :-) i think that it shows that the only meaningful | |
rule is | |
a) "a signal is not lost between unlocking the mutex and waiting on the | |
condition" | |
with clarification of waiters visibility as defined by POSIX above. | |
>> i would remove "_bTimedOut=false".. after all, it was a real timeout.. | |
> | |
>I disagree. A thread that can't successfully retract its waiter status
>can't really have timed out. If a thread can't return without executing
>extra code to deal with the fact that someone tried to unblock it, I think
>it is a poor idea to pretend we didn't realize someone was trying to
>signal us. After all, a signal is more important than a time out.
a) POSIX does allow a timed-out thread to consume a signal (a cancelled
thread is not allowed to).
b) ETIMEDOUT status just says that: "The time specified by abstime to | |
pthread_cond_timedwait() has passed." | |
c) it seems to me that hiding timeouts would violate "The
pthread_cond_timedwait() function is the same as pthread_cond_wait() except
that an error is returned if the absolute time specified by abstime passes
(that is, system time equals or exceeds abstime) before the condition cond
is signaled or broadcasted" because the abs. time did really pass before
cond was signaled (the waiter was released via the semaphore). however, if
it really matters, i could imagine that we can save the abs. time of
signal/broadcast and compare it with the timeout after unblock to find out
whether it was a 'real' timeout or not. absent this check i do think that
hiding timeouts would result in a technical violation of the
specification.. but i think that this check is not important and we can
simply trust the timeout error code provided by wait since we are not
trying to make a 'hard' realtime implementation.
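The check Alexander imagines can be sketched as a pure function (hypothetical, not part of any real implementation): record the absolute time at which the last signal/broadcast was issued, and have a released waiter report ETIMEDOUT only if its deadline passed before that signal. Times are opaque tick counts here, for illustration only.

```c
#include <errno.h>

/* Decide whether a wakeup after the deadline was a 'real' timeout.
 * deadline    - the waiter's abstime, in arbitrary ticks
 * signal_time - recorded time of the last signal/broadcast
 * Returns 0 (success) if the signal arrived in time, else ETIMEDOUT. */
int classify_wakeup(long deadline, long signal_time) {
    if (signal_time <= deadline)
        return 0;          /* signalled before the deadline: not a timeout */
    return ETIMEDOUT;      /* deadline passed before any signal            */
}
```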
>What are IPC semaphores? | |
<sys/sem.h>
int semctl(int, int, int, ...);
int semget(key_t, int, int);
int semop(int, struct sembuf *, size_t);
they support adjustment of the semaphore counter (semval)
in one single call - imagine Win32 ReleaseSemaphore( hsem, -N )
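A minimal demonstration of that one-call adjustment with System V (IPC) semaphores - a single semop() with a negative sem_op, the hypothetical Win32 analogue of ReleaseSemaphore( hsem, -N ), which the real Win32 API does not offer:

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* SUSv3 requires the caller to define this union for semctl(). */
union semun { int val; struct semid_ds *buf; unsigned short *array; };

/* Creates a private semaphore, sets semval to 5, subtracts 3 in one
 * semop() call, and returns the final semval (expected 2), or -1 on
 * error. */
int adjust_by_n_demo(void) {
    int id = semget(IPC_PRIVATE, 1, 0600 | IPC_CREAT);
    if (id < 0)
        return -1;
    union semun arg;
    arg.val = 5;
    semctl(id, 0, SETVAL, arg);        /* semval = 5               */
    struct sembuf op = { 0, -3, 0 };   /* one semop(): semval -= 3 */
    semop(id, &op, 1);
    int v = semctl(id, 0, GETVAL);
    semctl(id, 0, IPC_RMID);           /* remove the semaphore set */
    return v;
}
```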
>In the line where you state, "else if ( nWaitersBlocked > nWaitersGone ) { | |
>// HARMLESS RACE CONDITION!" there is no race condition for nWaitersGone | |
>because nWaitersGone is never modified without holding mtxUnblockLock. You | |
>are correct that there is a harmless race on nWaitersBlocked, which can | |
>increase and make the condition become true just after we check it. If
>this happens, we interpret it as the wait starting after the signal.
well, the reason why i've asked on comp.programming.threads whether this
race condition is harmless or not is that in order to be harmless it should
not violate the waiters visibility rule (see above). fortunately, we
increment the counter under protection of the external mutex.. so that any
(signalling) thread which acquires the mutex next should see the updated
counter (in signal) according to POSIX memory visibility rules for mutexes
(memory barriers). but i am not so sure how it actually works on
Win32/INTEL, which does not explicitly define any memory visibility
rules :(
>I like your optimization of this. You could improve Alg. 6 as follows: | |
>---------- Algorithm 6b ---------- | |
>signal(bAll) { | |
> _nSig=0 | |
> lock counters | |
> // this is safe because nWaiting can only be decremented by a thread
> // that owns counters and nGone can only be changed by a thread that
> // owns counters.
> if (nWaiting>nGone) { | |
> if (0==nSignaled) { | |
> sema_wait gate // close gate if not already closed | |
> } | |
> if (nGone>0) { | |
> nWaiting-=nGone | |
> nGone=0 | |
> } | |
> _nSig=bAll?nWaiting:1 | |
> nSignaled+=_nSig | |
> nWaiting-=_nSig | |
> } | |
> unlock counters | |
> if (0!=_nSig) { | |
> sema_post queue, _nSig | |
> } | |
>} | |
>---------- ---------- ---------- | |
>I guess this wouldn't apply to Alg 8a because nWaitersGone changes
>meanings depending upon whether the gate is open or closed.
agree. | |
>In the loop "while ( nWaitersWasGone-- ) {" you do a sema_wait on | |
>semBlockLock. Perhaps waiting on semBlockQueue would be a better idea. | |
you are correct. my mistake. | |
>What have you gained by making the last thread to be signaled do the waits | |
>for all the timed out threads, besides added complexity? It took me a long | |
>time to figure out what your objective was with this, to realize you were | |
>using nWaitersGone to mean two different things, and to verify that you | |
>hadn't introduced any bug by doing this. Even now I'm not 100% sure. | |
> | |
>What has all this playing about with nWaitersGone really gained us
>besides a
>lot of complexity (it is much harder to verify that this solution is | |
>correct), execution overhead (we now have a lot more if statements to | |
>evaluate), and space overhead (more space for the extra code, and another | |
>integer in our data)? We did manage to save a lock/unlock pair in an | |
>uncommon case (when a time out occurs) at the above mentioned expenses in | |
>the common cases. | |
well, please consider the following: | |
1) with multiple waiters unblocked (but some timed out) the trick with the
counter seems to ensure a potentially higher level of concurrency by not
delaying most of the unblocked waiters for semaphore cleanup - only the
last one will be delayed, but all the others would already contend for,
acquire and release the external mutex - the critical section protected by
mtxUnblockLock is made smaller (an increment plus a couple of IFs is faster
than a system/kernel call), which i think is good in general. however, you
are right, this is done at the expense of 'normal' waiters..
2) some semaphore APIs (e.g. POSIX IPC sems) do allow adjusting the
semaphore counter in one call => fewer system/kernel calls.. imagine:
if ( 1 == nSignalsWasLeft ) {
  if ( 0 != nWaitersWasGone ) {
    ReleaseSemaphore( semBlockQueue, -nWaitersWasGone ); // better now than
                                                         // spurious later
  }
  sem_post( semBlockLock ); // open the gate
}
3) even on win32 a single thread doing multiple cleanup calls (to wait) | |
will probably result in faster execution (because of processor caching) | |
than multiple threads each doing a single call to wait. | |
>As for 8b, c, and d, they look ok though I haven't studied them | |
thoroughly. | |
>What would you use them for? | |
8b) for semaphores which do not allow unblocking multiple waiters
in a single call to post/release (e.g. POSIX realtime semaphores - | |
<semaphore.h>) | |
8c/8d) for WinCE prior to 3.0 (WinCE 3.0 does have semaphores) | |
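For the 8b case, the only portable option with POSIX realtime semaphores is to post once per waiter to be released. A minimal helper sketching that loop (illustrative only; no threads actually block here):

```c
#include <semaphore.h>

/* POSIX realtime semaphores (<semaphore.h>) release at most one
 * waiter per sem_post(), so unblocking N waiters takes N calls.
 * Returns 0 on success, -1 if any sem_post() fails. */
int post_n(sem_t *sem, int n) {
    for (int i = 0; i < n; i++)
        if (sem_post(sem) != 0)
            return -1;
    return 0;
}
```

Compared with a single ReleaseSemaphore( hsem, N ) on Win32 or a one-shot semop() adjustment, this costs N kernel calls, which is part of the trade-off being discussed.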
ok. so, which one is the 'final' algorithm(s) which we should use in | |
pthreads-win32?? | |
regards, | |
alexander. | |
---------------------------------------------------------------------------- | |
Louis Thomas <lthomas@arbitrade.com> on 02/27/2001 05:20:12 AM | |
Please respond to Louis Thomas <lthomas@arbitrade.com> | |
To: Alexander Terekhov/Germany/IBM@IBMDE | |
cc: rpj@ise.canberra.edu.au, Thomas Pfaff <tpfaff@gmx.net>, Nanbor Wang | |
<nanbor@cs.wustl.edu> | |
Subject: RE: FYI/comp.programming.threads/Re: pthread_cond_* implementation questions
Sorry all. Busy week. | |
> this insures the fairness | |
> which POSIX does not (e.g. two subsequent broadcasts - the gate does | |
insure | |
> that first wave waiters will start the race for the mutex before waiters | |
> from the second wave - Linux pthreads process/unblock both waves | |
> concurrently...) | |
I'm not sure how we are any more fair about this than Linux. We certainly | |
don't guarantee that the threads released by the first broadcast will get | |
the external mutex before the threads of the second wave. In fact, it is | |
possible that those threads will never get the external mutex if there is | |
enough contention for it. | |