fetch missing bridge descriptors without delay

Without this change, if we have a working bridge, and we add a new bridge,
we will schedule the fetch attempt for that new bridge descriptor for
three hours(!) in the future.

This change is especially needed because of bug #40396, where if you have
one working bridge and one bridge whose descriptor you haven't fetched
yet, your Tor will stall until you have successfully fetched that new
descriptor -- in this case for hours.

In the old design, we would put off all further bridge descriptor fetches
once we had any working bridge descriptor. In this new design, we make the
decision per bridge based on whether we successfully got *its* descriptor.

To make this work, we need to also call learned_bridge_descriptor() every
time we get a bridge descriptor, not just when it's a novel descriptor.

Fixes bug 40396.

Also happens to fix bug 40495 (redundant descriptor fetches for every
bridge) since now we delay fetches once we succeed.

A side effect of this change is that if we have any configured bridges
that *aren't* working, we will keep trying to fetch their descriptors
on the modern directory retry schedule -- every couple of seconds for
the first half minute, then backing off after that -- which is a lot
faster than before.
This commit is contained in:
Roger Dingledine 2021-10-23 04:18:00 -04:00
parent 2f171f30c8
commit d66549c208
5 changed files with 25 additions and 22 deletions

View File

@ -3556,14 +3556,15 @@ The following options are used for running a testing Tor network.
[[TestingAuthKeySlop]] **TestingAuthKeySlop** __N__ **seconds**|**minutes**|**hours** +
[[TestingBridgeBootstrapDownloadInitialDelay]] **TestingBridgeBootstrapDownloadInitialDelay** __N__::
Initial delay in seconds for when clients should download each bridge descriptor when they
have just started, or when they can not contact any of their bridges.
Initial delay in seconds for how long clients should wait before
downloading a bridge descriptor for a new bridge.
Changing this requires that **TestingTorNetwork** is set. (Default: 0)
[[TestingBridgeDownloadInitialDelay]] **TestingBridgeDownloadInitialDelay** __N__::
Initial delay in seconds for when clients should download each bridge descriptor when they
know that one or more of their configured bridges are running. Changing
this requires that **TestingTorNetwork** is set. (Default: 10800)
How long to wait (in seconds) once clients have successfully
downloaded a bridge descriptor, before trying another download for
that same bridge. Changing this requires that **TestingTorNetwork**
is set. (Default: 10800)
[[TestingClientConsensusDownloadInitialDelay]] **TestingClientConsensusDownloadInitialDelay** __N__::
Initial delay in seconds for when clients should download consensuses. Changing this

View File

@ -961,12 +961,14 @@ learned_bridge_descriptor(routerinfo_t *ri, int from_cache)
if (bridge) { /* if we actually want to use this one */
node_t *node;
/* it's here; schedule its re-fetch for a long time from now. */
if (!from_cache) {
/* This schedules the re-fetch at a constant interval, which produces
* a pattern of bridge traffic. But it's better than trying all
* configured bridges several times in the first few minutes. */
download_status_reset(&bridge->fetch_status);
/* it's here; schedule its re-fetch for a long time from now. */
bridge->fetch_status.next_attempt_at +=
get_options()->TestingBridgeDownloadInitialDelay;
}
node = node_get_mutable_by_id(ri->cache_info.identity_digest);

View File

@ -73,15 +73,14 @@ find_dl_min_delay(const download_status_t *dls, const or_options_t *options)
}
}
case DL_SCHED_BRIDGE:
if (options->UseBridges && num_bridges_usable(0) > 0) {
/* A bridge client that is sure that one or more of its bridges are
* running can afford to wait longer to update bridge descriptors. */
return options->TestingBridgeDownloadInitialDelay;
} else {
/* A bridge client which might have no running bridges, must try to
* get bridge descriptors straight away. */
return options->TestingBridgeBootstrapDownloadInitialDelay;
}
/* Be conservative here: always return the 'during bootstrap' delay
* value, so we never delay while trying to fetch descriptors
* for new bridges. Once we do succeed at fetching a descriptor
* for our bridge, we will adjust its next_attempt_at based on
* the longer "TestingBridgeDownloadInitialDelay" value. See
* learned_bridge_descriptor() for details.
*/
return options->TestingBridgeBootstrapDownloadInitialDelay;
default:
tor_assert(0);
}

View File

@ -1617,6 +1617,13 @@ router_add_to_routerlist(routerinfo_t *router, const char **msg,
"descriptor for router %s",
router_describe(router));
} else {
if (router->purpose == ROUTER_PURPOSE_BRIDGE) {
/* Even if we're not going to keep this descriptor, we need to
* let the bridge descriptor fetch subsystem know that we
* succeeded at getting it -- so we can adjust the retry schedule
* to stop trying for a while. */
learned_bridge_descriptor(router, from_cache);
}
log_info(LD_DIR,
"Dropping descriptor that we already have for router %s",
router_describe(router));

View File

@ -6652,13 +6652,7 @@ test_dir_find_dl_min_delay(void* data)
dls.schedule = DL_SCHED_BRIDGE;
/* client */
mock_options->ClientOnly = 1;
mock_options->UseBridges = 1;
if (num_bridges_usable(0) > 0) {
tt_int_op(find_dl_min_delay(&dls, mock_options), OP_EQ, bridge);
} else {
tt_int_op(find_dl_min_delay(&dls, mock_options), OP_EQ, bridge_bootstrap);
}
tt_int_op(find_dl_min_delay(&dls, mock_options), OP_EQ, bridge_bootstrap);
done:
UNMOCK(networkstatus_consensus_is_bootstrapping);