[Unicode]   Common Locale Data Repository : Bug Tracking Home | Site Map | Search
 
Modify

CLDR Ticket #10990(accepted)

Opened 8 months ago

Last modified 5 weeks ago

Fix synchronize (threading)

Reported by: mark Owned by: tbishop
Component: survey-backend Data Locale:
Phase: dvet Review:
Weeks: Data Xpath:
Xref:

Description

Noticed this code. I have a suspicion that our use of synchronization is a factor in suboptimal performance. Where we are synchronizing on too gross a level we will degrade performance.

public Level getLevel(String path) {

if (path == null) {

return Level.UNDETERMINED;

}
synchronized (lookup) { synchronize on the class, since the Matchers are changed during the matching process

Other options: we have some "kitchen sink" classes that could be split to reduce dependencies, and threading issues.

Attachments

Change History

comment:1 Changed 7 months ago by mark

  • Owner changed from anybody to tbishop
  • Phase changed from dsub to dvet
  • Priority changed from assess to major
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 34

comment:2 Changed 6 months ago by tbishop

This is my first ticket that's specifically for performance improvement. The two main sections of http://cldr.unicode.org/index/cldr-engineer/sow are Performance Improvement Goals and UI Improvement Goals. For performance, "The chief problem is that the performance degrades under higher load."

"Milestone #1: Create scaffolding so that simultaneous input from multiple vetters can be simulated by a tool that interacts with the survey tool, using the common actions specified in the goals."

"Milestone #2: Complete an analysis of the hotspots in loaded operation, and create a document of recommend tasks to address them (in priority order, based on the best ROI). The CLDR committee will then settle on a set of tasks to address the performance issues based on the recommendations"

Before tinkering with synchronization, I'd like to focus on testing: both performance testing to measure improvements, and unit testing to enable refactoring while guarding against unintended changes.

I'll study and employ the tests that already exist, and work on new tests.

Simulated simultaneous input from multiple vetters will be a key part of the new tests, ideally for this ticket 10990 as well as the overall task of performance improvement.

Last edited 6 months ago by tbishop (previous) (diff)

comment:3 Changed 6 months ago by tbishop

SurveyConsole may be a useful testing tool.

Draft of a future tools/SurveyConsole/readme-SurveyConsole.txt:

CLDR SurveyConsole - readme for the cldr/tools/SurveyConsole folder

SurveyConsole is a node.js monitoring app, and may be used as an uptime monitor.

It is a separate application from SurveyTool.

SurveyConsole and SurveyTool are not necessarily always both running at once.

comment:4 Changed 7 weeks ago by tbishop

  • Keywords performance added
  • type changed from unknown to surveytool
  • Component changed from unknown to survey
  • Milestone changed from 34 to 35

comment:5 Changed 5 weeks ago by tbishop

getLevel is in cldr/tools/java/org/unicode/cldr/test/CoverageLevel2.java:

    public Level getLevel(String path) {
        if (path == null) {
            return Level.UNDETERMINED;
        }
        synchronized (lookup) { // synchronize on the class, since the Matchers are changed during the matching process
            Level result;
            if (DEBUG_LOOKUP) { // for testing
                Output<String[]> checkItems = new Output<String[]>();
                Output<Finder> matcherFound = new Output<Finder>();
                List<String> failures = new ArrayList<String>();
                result = lookup.get(path, myInfo, checkItems, matcherFound, failures);
                for (String s : failures) {
                    System.out.println(s);
                }
            } else {
                result = lookup.get(path, myInfo, null);
            }
            return result == null ? Level.OPTIONAL : result;
        }
    }

I ran cldr-apps TestAll.java and measured the total time, alternating with and without "synchronize (lookup)", three times. Results:

With sync: 101.065, 102.460, 100.960 seconds
Without: 99.591, 101.521, 99.304

I also ran cldr-unittest TestAll.java and measured the total time, alternating with and without "synchronize (lookup)", twice. Results:

With sync: 360.299, 346.410
Without: 348.018, 349.391

All tests passed. For performance, the results are inconclusive. The slight differences might be noise, and if removing synchronize has any benefit in this scenario it might be on the order of one percent. However, this is a single-user scenario, so threading isn't getting tested much at all yet. We'll learn more from simulated simultaneous input from multiple vetters.

comment:6 Changed 5 weeks ago by tbishop

Running Survey Tool on the server, this is the stack the first time getLevel is called:

Daemon Thread [{ST Threads: Tasks waiting:0, Current:startup, Running:true}] (Suspended (breakpoint at line 168 in CoverageLevel2))

owns: RegexLookup<T> (id=104)
owns: LocalCache$StrongAccessEntry<K,V> (id=105)
owns: SurveyMain (id=106)
CoverageLevel2.getLevel(String) line: 168
CoverageInfo$1.call() line: 98
CoverageInfo$1.call() line: 1
LocalCache$LocalManualCache$1.load(Object) line: 4870
LocalCache$LoadingValueReference<K,V>.loadFuture(K, CacheLoader<? super K,V>) line: 3524
LocalCache$Segment<K,V>.loadSync(K, int, LoadingValueReference<K,V>, CacheLoader<? super K,V>) line: 2273
LocalCache$Segment<K,V>.lockedGetOrLoad(K, int, CacheLoader<? super K,V>) line: 2156
LocalCache$Segment<K,V>.get(K, int, CacheLoader<? super K,V>) line: 2046
LocalCache<K,V>.get(K, CacheLoader<? super K,V>) line: 3943
LocalCache$LocalManualCache<K,V>.get(K, Callable<? extends V>) line: 4865
CoverageInfo.getCoverageLevel(String, String) line: 86
CoverageInfo.getCoverageValue(String, String) line: 117
SurveyMain.checkAllLocales() line: 5126
SurveyMain.isLocaleAliased(CLDRLocale) line: 5232
SurveyMain.doStartup() line: 5891
SurveyMain$1.run() line: 534
SurveyThread.run() line: 240

comment:7 follow-up: ↓ 8 Changed 5 weeks ago by tbishop

This is odd:

    private synchronized void checkAllLocales() {
        if (aliasMap != null)
            return;

        boolean useCache = isUnofficial(); // NB: do NOT use the cache if we are
        // in unofficial mode. Parsing here
        // doesn't take very long (about
        // 16s), but
        // we want to save some time during development iterations.
        ...

The comment for useCache seems to say the opposite of the current code: currently useCache is true if and only if isUnoffical() returns true; the cache IS used in unofficial mode ONLY. Running on localhost:

isUnofficial() returned true

useCache true

comment:8 in reply to: ↑ 7 ; follow-up: ↓ 9 Changed 5 weeks ago by srl

Replying to tbishop:

This is odd:

boolean useCache = isUnofficial(); NB: do NOT use the cache if we are
in unofficial mode.

Sorry, it should say "in OFFICIAL mode". I messed up the double negative. The cache is used for development (not-production) mode because it speeds up the iteration of speeding up the server a little bit. In production, we want the files to be more carefully checked every time.

comment:9 in reply to: ↑ 8 Changed 5 weeks ago by tbishop

Replying to srl:

Replying to tbishop:

This is odd:

boolean useCache = isUnofficial(); NB: do NOT use the cache if we are
in unofficial mode.

Sorry, it should say "in OFFICIAL mode". I messed up the double negative. The cache is used for development (not-production) mode because it speeds up the iteration of speeding up the server a little bit. In production, we want the files to be more carefully checked every time.

Thanks! So the code is right. I'll fix the comment.

View

Add a comment

Modify Ticket

Action
as accepted
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.