Saved Feb 9, 2016
Concurrent Framework Project (CF4hep) Meeting notes
----------------Chronologically reversed!!---------------------------
13th of February 
  • Present: Paul, Pere, Danilo, Illya, Daniel, Benedikt, Andrea, Marco
  • Daniel: working on the Tools. Adding data dependencies of tools from the algorithms. Halfway done.
  • Paul: [illegible] LHCb and GaudiHive. Presentation to the CF annual meeting.
  • Illya: merging data dependencies and control flow. Extracting the dependencies is not obvious (help from Markus).
  • Presentations to Annual CF meeting:
  • Daniel: GaudiHive
  • Illya: Merging control and data flow
  • Paul: Issues with G4 MT and LHCb simulation
 
 
30th of January
  • Present: Andrea, Pere, Marco, Danilo, Illya, Daniel, Benedikt, Robert
  • Round table:
  • Marco. Update of Gaudi GIT with the changes to drop Reflex and make it compatible with ROOT 6. It also includes the ATLAS modifications. Agreed that the merge can be done at any time during the next 7 days.
  • Daniel. Data handlers update. The mini-Brunel works with the new Data handlers. Danilo suggests that the tests should be migrated. The CPUcruncher currently has the number of inputs and outputs hardwired. It was agreed to extend the number to more than a few.
  • Illya. Exceptions of GaudiAlg due to being in the wrong state. Fixed.
  • New control flow machinery. Started to work with the Data handler of Daniel. Next will try to combine the data and control flow in a single graph. 
  • Robert: the latest measurements say that 5% of reconstruction can be made parallel on average.
  • Danilo: why do we need public tools? They need to be converted to services, and we then solve the service case.
  • Parallelizing the simulation based on Geant4MT could be the next step.
  • Pere: Logic state analyzer for Gaudi.  
12th of December
  • Present: Pere,  Daniel, Danilo, Benedikt,
 
28th of November
  • Present: Pere, Daniel, Danilo, Graeme, Marco, Illya, Benedikt, Andrea, Robert, Jovan (remote)
  • Daniel Presentation:
  • Add the link of the presentation. Good discussion.
  • Necessity to have a micro-benchmark to measure the real ....
  • Round Table:
  • ATLAS/Gaudi convergence. Some more changes are in the pipeline.
  • Robert: Extracting and measuring the optimal scheduling for the ATLAS reco.
  • Illya: continuing the development of CF with a realistic control graph.
  • AoB:
14th of November
  • Present: Pere,  Daniel, Danilo, Graeme,  Benedikt, Marco
  • News:
  • Pere reporting the Alice meeting.
  • Communication with GPUs (coprocessor) is a topic for discussion in a later meeting. We need to have some implementation of it in GaudiHive.
  • Marco: context specific storage: two implementations. With PODs works like a dream. The changes from ATLAS are not yet integrated into the main repository.
  • Benedikt: IEEE report (merge of 2 CHEP presentations in 12 minutes).
  • Graeme: Adding ID to AthenaHive is going to be a big step. Need to understand the way it works.
  • Thread sanitiser. It can help but it does not understand the TBB way of protecting. It generates lots of false positives.
  • Illya: coming back to the task on control flow. New traversals with a real network in Brunel (400 nodes including the sequencers). The general implementation is expected by the end of the year.
  • Daniel: Benchmark in place. Discovered a thread-unsafe issue. Running the benchmarks again.
  • Danilo: David contacted him for the ROOT I/O. 
 
7th of November
  • Present: Pere,  Daniel Funke, Danilo, Andrea, Benedikt, 
  • News: 
  • David Smith contacted Pere to contribute to the 'concurrency'. Interest in the concurrent I/O.
  • Status of ATLAS tracking strategy.
  • Outcome of sprint last week. The main issue is the integration of Gaudi[Hive] into the ATLAS stack. It should be treated as an 'external' project.
  • Progress:
  • Daniel: adding parameters to the command line to change the values of properties.
  • Adding the 'outputStream' into Mini-Brunel is becoming a priority. We should be able to compare the output files between the various scheduling options.
 
19th of September
  • Present: Pere, Stefan Lohn, Daniel Funke, Marco Clemencic, Benedikt, Andrea, 
  • Round table:
  • Daniel has taken the parallel sequential scheduler as initial task to get used to the system.
  • Marco. Ilya will be here next week to present the control flow progress.
  • Stefan. Gave a try to build in Ubuntu and had some problems. Marco will provide instructions.
  • Benedikt. Tools are becoming important. Need to get some low level structure to handle 'per-slot storage'.
  • Andrea. Not yet started. Discussion on the way conditions work in Gaudi.
  • Sprint session with ATLAS (October 21-25). Are people available?
  • CHEP presentations:
  • Two abstracts (Gaudi scaling; design patterns). Add a few more measurements with different schedulers (round-robin).
  • In two meetings from now we should have a look at the draft (October 3rd). Papers should be ready by the last day of the conference.
  • .... 
 
29th of August 
  • Present: Pere, Robert Langenberg, Stefan Lohn, Sandro, Danilo, Marco Clemencic, 
  • Actions for ToolSvc
  • Percolation of data dependencies from tools to algorithms.
  • Instrument ToolSvc to find out when a tool is really shared; late instantiation
  • Taxonomy of tools. Write test cases for each one.
  • Provide solutions for each case.
  • Ilya did some work on the control flow refactoring. The work can be presented later when he is back.
  • We need to review the list of JIRA tasks and identify the ones with higher priority 
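A possible shape for the ToolSvc instrumentation action above, as a minimal Python sketch rather than Gaudi code. The class `InstrumentedToolSvc`, its methods, and the tool and algorithm names are all hypothetical. The idea is only to record which clients retrieve each shared tool, so that tools used by a single client can be flagged as effectively private.

```python
from collections import defaultdict


class InstrumentedToolSvc:
    """Hands out shared ('public') tools and records who asked for them."""

    def __init__(self, factories):
        self._factories = factories        # tool name -> callable creating it
        self._tools = {}                   # tool name -> single shared instance
        self._clients = defaultdict(set)   # tool name -> requesting algorithms

    def retrieve(self, tool_name, client):
        self._clients[tool_name].add(client)
        if tool_name not in self._tools:   # late instantiation on first request
            self._tools[tool_name] = self._factories[tool_name]()
        return self._tools[tool_name]

    def effectively_private(self):
        """Public tools that were only ever used by a single client."""
        return sorted(t for t, c in self._clients.items() if len(c) == 1)

    def really_shared(self):
        """Public tools requested by more than one client."""
        return sorted(t for t, c in self._clients.items() if len(c) > 1)


# Hypothetical job: two algorithms, two tools (plain dicts stand in for tools).
svc = InstrumentedToolSvc({"Extrapolator": dict, "Fitter": dict})
svc.retrieve("Extrapolator", "AlgA")
svc.retrieve("Extrapolator", "AlgB")
fitter = svc.retrieve("Fitter", "AlgA")
svc.retrieve("Fitter", "AlgA")
```

In this toy job `Fitter` would be reported as effectively private and `Extrapolator` as really shared, which is the kind of input the tool taxonomy needs.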
 
15th of August 
  • Present: Pere, Benedikt, Orlando, Andrea, Illya, Danilo
  • News/Progress
  • Benedikt: Round robin scheduler
  • Illya: Same work ongoing in the control flow machinery. No measurements yet. On paper it should be faster. It will be presented once there is something concrete.
  • Danilo: Static analysis tried on Gaudi but is not working. Waiting for Marco.
  • Andrea: Would like to do something in the conditions area.   
 
6th of June 
  • Present: Danilo, Pere, Benedikt, Marco, Sandro, Rolf, Daniel
  • Feedback from Danilo's presentation:
  • New hardware (2012) not being tested yet. There was no time before the presentation.
  • Output I/O would be nice to be added to make it more realistic.
  • Discussion on the NUMA effects. It seems that these are also visible with Gaudi-MP with late forking. This could be solved (mitigated) by starting two initial workers, one in each socket.
  • Discussion on the memory limits. Where are the real limits? Suggested to use 'smap'.
  • Defining the data processing problem to be used for comparing MT with MP and SQ. 
 
30th of May 
  • Present: Danilo, Pere, Benedikt, Marco, Sandro, Graeme, 
  • Reviewed very quickly the two JIRA updates
  • Danilo: discussion on performance plots
  • Round table:
  • Marco: new version of Gaudi in the oven
  • Benedikt: confusion on the existence of several ParticlePropSvc
 
23rd of May 
  • Present: Danilo, Pere, Benedikt, 
 
16th of May 
  • Present: Marco, Danilo, Benedikt, Pere, Graeme, Andrea, Sandro
  • Reviewed the modified/open JIRA tasks
  • Round table:
  • LHCb workshop next week. Possible clash with the concurrency forum
  • Danilo reported on the progress in GaudiHive (Mini-Brunel). Good memory footprint. Trying to understand scaling problems when moving above 4 or 5 threads. 
  • Plots were shown.
  • Questions to answer:
  • why do we overscale?
  • why do we saturate?
  • how much remote traffic between dies?
 
2nd of May 
  • Present: Marco, Danilo, Benedikt, Pere, Sandro, Riccardo, Illya
  • Round table:
  • Coding conventions. Taken the ones of LHCb and added into GIT. Investigate if they can be added in the nightlies (after 0.5)
  • Control Flow. Illya is making progress.
  • Sprint with ATLAS (in June). Try to allocate a room.
  • LXR browsing available (also linked in GaudiHive page)
 
25th of April 
  • Present: Danilo, Benedikt, Pere, Sandro, Marco, Sebastien
 
 
18th of April 
  • Present: Marco, Benedikt, Graeme, Illya, Rolf, Danilo, Sandro, Pere
  • Going through the slides of Danilo and Benedikt. Comments:
  • We need to measure the penalty of moving '1 event lived cache' to the TES  => Action Pere
  • The 'begin event' incident will be removed and solutions will need to be found for each of the problems found by Danilo.
  • Status of ATLAS
  • Move to new version of Gaudi. Problems with the latest CMT-v26. It was never validated by LHCb. 
  • Wim and Charles making good progress.   
 
11th of April 
  • Present: Marco, Sebastien, Benedikt
  • Status from ATLAS: 
  • Gaudi - AthenaGaudi merge almost done; will switch nightlies to it soon
  • Will push to main Gaudi (CFHEP-85)
  • Waf build system for concurrent Gaudi almost done
  • Marco volunteers to take over the Coverity tests for GaudiHive (CFHEP-79)
  • Discussing CFHEP-78; Marco and Benedikt will look into it
 
4th of April 
  • Present: Graeme, Illya, Rolf, Ric, Sandro, Pere
 
28th of March
  • Present: Andrea, Sandro, Rolf, Pere, Danilo
  • Status of ATLAS getting up to speed. Basically the building problems are gone.
  • Reviewed the changed tasks/stories from last week
 
21st of March
  • Present: Danilo, Pere, Sandro, Benedikt, Rolf, Ric, Graeme, Werner, Marco Cl, Illya, Sebastien (remote)
  • Summary of the ATLAS meeting last Friday
  • Rolf reviewed the list of actions agreed at the meeting.
  •   0) sign-up to cf4hep-devel@cern.ch. Use it for all communication. Doodle poll for weekly/fortnightly meeting slot (Wed 17:00 CERN to
  •   1) Try GaudiHive as is in ATLAS svn. Merge needed ATLAS Gaudi changes back in GaudiHive git repository.
  •   2) define a basic calorimeter job, and capture its dataflow. Simulate in GaudiHive.
  •   3) implement an ATLAS GaudiHive event loop mgr
  •   4) implement a multi-event store using StoreGateSvc and ActiveStoreSvc. This HiveStoreSvc should implement the IHiveWhiteBoard.
  •   5) implement ToolSvc that copies all tools by default (leaving them as private to Algo instance). Identify all public tools which are effectively private (using access count) and rewrite them as private. Build a list of public tools that need to be modified (put their state in event store)
  •   6) implement event loader, test with ATLAS event read/write unit tests
  • Not yet fully clear who will be assigned to each one of the actions. Charles to 1), Paolo to 4), Wim to 5)
  • Reviewed the list of modified/created tasks. New assignments:
  • Coding conventions: Benedikt
  • Status of Mini-Brunel
  • Parallel algorithms but single event is now achieved
  • Declaration of inputs is done in C++ by calling a 'declareData'. Sebastien will have a look at the existing handles to see if we can live with one type of handle.
  • VarHandle code:
  • example of use of VarHandles:
  • Auditors not touched, nor any strategy for them. Is anybody signing up for this?
  • Getting the detector components from the DetDataSvc needs to be threadsafe. Probably a lock to protect the access would be sufficient.
  • Work in gaudiexe to be able to read the python options directly
  • Graeme reports on building problems due to locales.
 
7th of March
  • Present: Marco Cl., Danilo, Pere, Sandro, Benedikt, Illya
  • Danilo report:
  • Testing the sequential running of full Brunel using the new concurrent services (AlgPool, ForwardScheduler, WhiteBoard, MessageSvc,...). It works and the performance is not degraded at all.
  • Next is  ...
  • CHEP papers:
  • GaudiHive paper: performance of the work-flows for  LHCb, ATLAS, CMS with proper correnations. ,... + Mini-Brunel
  • Lessons Learnt of the implementations. Histogramming, lock-free data structures. Adding concurrency to HEP software.
  • Participation to the ATLAS software week next week
 
27th of February
  • Present: Marco Cl., Danilo, Pere, 
  • We went through the list of updated tasks (added, updated, done)
  • Ideas for GSoC 
  • Rework of the 'configurables'
  • Concurrent 'Initialization'. Danilo will draft a job description.
 
 
21st of February
  • Present: Marco Cl., Danilo, Pere
  • Marco reported on nightly and CDash for GaudiHive
  • Danilo reported on the progress on various tasks including the improved 
 
14th of February
  • Present: Marco Cl., Danilo, Pere, Benedikt, Marco Corvo, Markus, Francesco Giacomini 
  • Review the vision and goals: To be able to run Mini-Brunel with realistic physics code in order to compare and measure all the 'benefits' of running a multi-threaded application. We should have all three versions MJ, MP and MT at hand to perform these measurements.
  • Changes decided at the meeting will be reflected in the JIRA tasks
 
 
31st of January
  • Present: Marco Cl., Danilo, Ric, Pere
  • Review the goals of the meeting of next week.
  • LHCb: Investigations of OpenGL and the Gaudi framework.
  • Status of GaudiHive:
  • Pere: The read&write test is now working thanks to the introduction of 'resources' in the scheduling, which prevent the read and the write algorithms from running concurrently.
  • Danilo: Making progress on the AlgScheduler. The implementation will include a dedicated thread to execute 'lambdas' from a queue.
  • Ideas for future work items:
  • Parallelization of the initialization sequence (using tasks and futures). 
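The 'resources' idea reported by Pere above (algorithms that need the same non-shareable resource must not run at the same time) can be sketched as follows. This is a minimal Python illustration, not GaudiHive code: `ResourcePool`, `schedule`, the algorithm names and the resource label `"ROOT_IO"` are hypothetical.

```python
import threading


class ResourcePool:
    """Tracks named resources that only one algorithm may hold at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = set()

    def try_acquire(self, needed):
        """Atomically claim all resources in `needed`, or none of them."""
        needed = set(needed)
        with self._lock:
            if self._busy & needed:
                return False
            self._busy |= needed
            return True

    def release(self, held):
        with self._lock:
            self._busy -= set(held)


def schedule(algorithms, pool):
    """Split (name, resources) pairs into those that can start now and the rest.

    Algorithms whose resources are taken stay in the waiting list and are
    reconsidered on a later pass, once the holder has released them.
    """
    started, waiting = [], []
    for name, resources in algorithms:
        if pool.try_acquire(resources):
            started.append(name)
        else:
            waiting.append(name)
    return started, waiting


pool = ResourcePool()
algs = [("ReadAlg", ["ROOT_IO"]), ("WriteAlg", ["ROOT_IO"]), ("TrackFit", [])]
started, waiting = schedule(algs, pool)
```

Here the reader and the unrelated `TrackFit` start together while the writer waits; once the reader calls `pool.release(["ROOT_IO"])`, a second scheduling pass lets the writer run.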
 
 
10th of January
  • Present: Marco Cl., Benedikt, Pere, Markus, Rolf, 
  • Workflow of ATLAS
  • Benedikt: after removing the inconsistencies he gets a speedup of 8, which is not realistic.
  • Charles is working on it.
  • Development of GaudiHive
  • New WhiteBoard service implementation based on the multi-event data store. It is using thread local storage to keep the event slot in all the operations
  • New requirements for AlgPool to include 'resources' that can not be shared. This is to avoid for example two algorithms doing ROOT I/O at the same time.
  • SignalSvc (events or incidents) to drive all the concurrent interactions. This triggered a lot of discussion:
  • Exercise the use of 'lambdas' in the SignalSvc
  • Decided on the same interface with different implementations depending on the weight.
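The WhiteBoard point above (a multi-event store that uses thread local storage to remember the event slot) might look roughly like this. It is a Python sketch under assumptions, not the actual service: the `WhiteBoard` class, `select_slot`, and the key `"/Event/Raw"` are hypothetical names.

```python
import threading


class WhiteBoard:
    """Multi-event store: one dictionary of data products per event slot."""

    def __init__(self, n_slots):
        self._slots = [dict() for _ in range(n_slots)]
        self._current = threading.local()   # per-thread "selected slot"

    def select_slot(self, slot):
        """Bind the calling thread to an event slot."""
        self._current.slot = slot

    def put(self, key, value):
        self._slots[self._current.slot][key] = value

    def get(self, key):
        return self._slots[self._current.slot][key]


def process_event(wb, slot, results):
    wb.select_slot(slot)                 # done once per task, e.g. by the scheduler
    wb.put("/Event/Raw", slot * 10)      # user code never mentions the slot
    results[slot] = wb.get("/Event/Raw") + 1


wb = WhiteBoard(n_slots=3)
results = {}
threads = [threading.Thread(target=process_event, args=(wb, s, results))
           for s in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The design choice illustrated is that `put` and `get` keep a single-event interface; only the component that dispatches work needs to know about slots.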
 
6th of December
  • Present: Marco Cl., Benedikt, Pere, Illya, Markus
  • Marco Cl. updated the LHCb stack with tagged versions (CMake does the git checkout).
  • Benedikt reports on the progress in Mini-Brunel
  • Working on the input data
  • Athena workflow: several attempts. The second iteration is much better (speedup is factor 8). Loops and updates.
  • Discussion on how to deal with update.
  • Don't do. Create a new product.
  • We worked out a solution using the 'control flow'. The consumer of the output data of an 'updater' will have a dependency added by the framework on the execution of the 'updater' algorithm.
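The 'updater' solution above can be illustrated with a small dependency builder. This is a sketch under assumptions, written in Python for brevity: the dictionary layout (`inputs`, `outputs`, `updates`) and the algorithm and product names are hypothetical. A consumer of a product gets an extra edge to every algorithm that updates that product, so it can only be scheduled after the update.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_dependencies(algs):
    """Map each algorithm to the set of algorithms that must run before it."""
    producers = {}   # product -> algorithm that creates it
    updaters = {}    # product -> algorithms that modify it in place
    for name, alg in algs.items():
        for product in alg.get("outputs", []):
            producers[product] = name
        for product in alg.get("updates", []):
            updaters.setdefault(product, []).append(name)

    deps = {name: set() for name in algs}
    for name, alg in algs.items():
        # Ordinary data dependencies: readers and updaters follow the producer.
        for product in alg.get("inputs", []) + alg.get("updates", []):
            if product in producers:
                deps[name].add(producers[product])
        # Edge added by the framework: consumers wait for the updaters.
        for product in alg.get("inputs", []):
            deps[name].update(u for u in updaters.get(product, []) if u != name)
    return deps


algs = {
    "MakeVertices": {"inputs": ["Tracks"], "outputs": ["Vertices"]},
    "RefitTracks": {"updates": ["Tracks"]},
    "MakeTracks": {"outputs": ["Tracks"]},
}
deps = build_dependencies(algs)
order = list(TopologicalSorter(deps).static_order())
```

Without the extra edge, `MakeVertices` and `RefitTracks` would both depend only on `MakeTracks` and could run in either order or concurrently.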
 
 
22nd of November
  • Present: Danilo, Markus, Marco Cl., Benedikt, Pere, Rolf, Sebastien
  • Danilo's report on mini-Brunel
  • Configuration simplification. Avoid the need for sequences. Not implemented in GaudiHive (not needed probably). Special sequences in Brunel....
  • Building system. Problems adding the special GaudiHive into the existing instructions. Very good that we are able to build the complete stack.
  • Benedikt
  • Got the workflow from ATLAS. Difficulties to extract the dependencies. Some issues: loops, disconnected graphs, etc. Rolf will provide a 'dot' file with the graph (probably no timings)
  • Whiteboard implementation.
  • No string manipulation; ids or hashes are needed. They can be computed at the same time as the inputs and outputs are declared.
  • StringKey class in Gaudi already exists.
  • Discussion on cast or not to cast. 
  • Sebastien
  • Discussion on the slides on VarHandles
  • Marco Cl. argues that a pointer to the 'owner' algorithm is needed. Benedikt would like to see a redundancy.
  • Agreed that for next meeting we should come with a 'mockup' of the user interface for declaring, getting and putting data from and to the whiteboard.
  • example of use of VarHandles:
  •  
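Returning to the whiteboard discussion earlier in this meeting (ids or hashes instead of string manipulation, computed when inputs and outputs are declared), here is a minimal sketch of the idea. It is written in Python for brevity and does not use the existing Gaudi StringKey class; `KeyRegistry`, `Algorithm` and the paths are hypothetical.

```python
class KeyRegistry:
    """Interns data-product paths as small integers at declaration time."""

    def __init__(self):
        self._ids = {}
        self._names = []

    def declare(self, path):
        """Return the integer id for `path`, assigning a new one if needed."""
        key = self._ids.get(path)
        if key is None:
            key = len(self._names)
            self._ids[path] = key
            self._names.append(path)
        return key

    def name(self, key):
        return self._names[key]


class Algorithm:
    def __init__(self, registry, inputs, outputs):
        # String handling happens once, here, not inside the event loop.
        self.input_ids = [registry.declare(p) for p in inputs]
        self.output_ids = [registry.declare(p) for p in outputs]


registry = KeyRegistry()
producer = Algorithm(registry, inputs=[], outputs=["/Event/Tracks"])
consumer = Algorithm(registry, inputs=["/Event/Tracks"],
                     outputs=["/Event/Vertices"])

# Per-event store indexed by integer id: no string comparison per access.
store = [None] * 2
store[producer.output_ids[0]] = ["track0", "track1"]
tracks = store[consumer.input_ids[0]]
```

The same ids can also serve as node labels when the scheduler builds the data dependency graph.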
 
15th of November
  • Present: Illya, Danilo, Markus, Marco Cl., Benedikt, Pere
  • Danilo showed the latest plot on performance of GaudiHive. It basically recovers the results shown by Markus one year ago.
  • Pere introduced the interest of Daniel Campora (LHCb) to work on the interface between new Gaudi and coprocessors like GPUs.
  • Workshop in Fermilab.
  • Minimal Git repository getting copies to LHCb and Brunel. Instructions on how to build and run. The instructions are in the Twiki
  • Actions needed by Marco: flatten the algorithms, dependencies, output stream configuration
  • Benedikt will create an AFS volume for these developments
  • LHCb is going to release a new version of Gaudi. Marco would like to update the GaudiHive git.
  • TBB installation issues and interface to CMT discussed. There are worries that TBB depends on the version of Linux kernel.
  • HistogramSvc investigations: we need first monitor the usage of histograms before doing any work on it. Illya will eventually take care of looking into it.
  • Reasons for using StoreGate for ATLAS (from a mail of Sebastien)
  • No need to inherit from DataObject
  • No need for hierarchical structure
 
8th of November
  • Present: Pere, Sebastien (remote), Markus Frank, Marco Clemencic, 
  • Report of the ATLAS and LHCb meetings
  • Marco Cl. reporting the discussion with ATLAS people. Items of work for ATLAS members:
  • Charles could work on understanding the interface of StoreGate and the new Whiteboard. Also to collect the dependencies and timing of the ATLAS reconstruction
  • Sebastien could have a look at reading and writing n-tuple data. This will include parallel I/O.
  • What can be done on the LHCb side?
  • Pere proposed the histogram service investigation and implementation of various prototypes to evaluate performance
  • Flow control discussion. Pere was proposing to add additional states in the algorithm to encapsulate the fact that an algorithm is not yet ready to be scheduled.
  • Mini-brunel. Enabling the Velo only would be a possibility according to Marco Cattaneo. ==> Marco Clemencic
 
18th of October
  • Present: Pere, Danilo, Illya, Markus
  • News:
  • Commenting the ATLAS meeting: presentations from Wim, Gordon, etc. Locking services could be a solution for the time being on services that we do not have the time to re-work. 
  • Comments on LHCb HLT requirements. Control-flow is a must in this situation.
  • Preparation of the ATLAS talk for Friday.
 
11th of October
  • Present: Marco, Pere, Danilo, Illya, Benedikt, Markus
  • Review of actions:
  • Timing of Brunel. It is going to be available tomorrow.
  • No news from Rolf to get the numbers from ATLAS; Hopefully today there will be some news. 
  • News:
  • ATLAS meeting next week. Danilo presentation including the status of other activities in CMS.
  • LHCb software week. Request on a summary of the concurrency activities and status of new Gaudi.
  • Discussion with CMS on histograms. It seems that a request to have thread-safe histograms will be made. We need to find out what is the situation in ATLAS. It is obvious that LHCb can also benefit.
  • Progress:
  • Ported GaudiHive to the MacOSX. Few problems reported by Pere.
  • Brunel on CMake is now working. 5 minutes Gaudi,... less than 1 hour Brunel. Marco will provide a recipe for building Brunel (it requires a few patches and scripts). This evening the branches will be merged.
  • Discussion:
  • IncidentSvc issues:
  • Synchronization of them. The special case of detector condition updates.
  • An inventory of incident usage is needed to analyse them and see what needs to be done.
  • DataOnDemand issues:
  • The solution is simple. It implies transforming the configuration information of the DataOnDemand and putting the algorithms in the global list of algorithms to be executed on availability of data.
  • The dataOnDemand also has a dynamic part dealing with some data <--> algorithm relation in terms of 'rules'. 
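The static part of the DataOnDemand solution above can be sketched as a configuration transformation followed by a data-driven run. This is only an illustration in Python with hypothetical names (`merge_on_demand`, `run_forward`, the algorithm and data names); the dynamic 'rules' part is not covered.

```python
def merge_on_demand(algorithms, on_demand):
    """Append data-on-demand entries to the global algorithm list.

    `algorithms` is a list of dicts with 'name', 'inputs' and 'outputs'.
    `on_demand` maps a data path to the (name, inputs) of the algorithm
    configured to create it.
    """
    merged = list(algorithms)
    for path, (name, inputs) in on_demand.items():
        merged.append({"name": name, "inputs": inputs, "outputs": [path]})
    return merged


def run_forward(algorithms, initial_data):
    """Run every algorithm as soon as all of its inputs are available."""
    available = set(initial_data)
    pending = list(algorithms)
    order = []
    progress = True
    while pending and progress:
        progress = False
        for alg in list(pending):
            if set(alg["inputs"]) <= available:
                order.append(alg["name"])
                available.update(alg["outputs"])
                pending.remove(alg)
                progress = True
    # Anything left over has inputs that nobody produces.
    return order, [a["name"] for a in pending]


algorithms = [{"name": "Fit", "inputs": ["Clusters"], "outputs": ["Tracks"]}]
on_demand = {"Clusters": ("Decode", ["Raw"])}
merged = merge_on_demand(algorithms, on_demand)
order, stalled = run_forward(merged, initial_data=["Raw"])
```

One consequence worth noting in this sketch: the former on-demand algorithm runs whenever its inputs exist, whether or not a consumer is configured.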
4th of October
  • Present: Marco, Pere, Danilo, Illya
  • Actions: 
  • Marco: Get the timing numbers and dependencies from Brunel. Delegated to Ben ... almost done
  • Benedikt: Rolf promised that the numbers from ATLAS will also be provided.
  • CMS numbers obtained from Chris. Benedikt is currently digesting them.
  • Mini-Brunel no progress.
  • Progress. Discussions and ideas:
  • Data on demand. Analysed some use cases of LHCb. Data-on-demand is used to do schema-evolution and transformation of transient to persistent representations. The issue is how to schedule using the forward and backward dependencies.
  • Histogramming service. Transitions of runs and lumi-sections.
  • Control flow issues. To be discussed later
  • Pere mentioned the plans to measure the relative performance of TLS versus object cloning.
 
 
27th of September
  • Present: Marco, Pere, Danilo, Benedikt
  • Need to collect better 'data' from Brunel. Event by event algorithm time (not only average)  and data dependencies.
  • Actions:  Marco to get the numbers, Benedikt get from CMS the actual format
  • Discussions on next steps
  • Mini Brunel (with 20-30 algorithms). This requires to build LHCb, LBCOM, REC and BRUNEL using the new version of Gaudi-Hive. It will be nice to use the latest CMake development that Marco is currently working on (1 month time scale)
  • Meanwhile study the interaction of Tools (private and public)
 
20th of September
  • Present: Marco, Pere, Danilo, Markus, Benedikt
  • Current status of CF4Hep (Hive). 
  • AlgorithmContextSvc has been removed (ill-defined concept). Agreed. Need to find the consequences for LHCb
  • CpuCruncherAlg: ( same as BusyWaitAlgorithm) using some randomness. 
  • Building a concurrent scheduler. First step with concurrency within one event (using Markus numbers of Brunel)
  • Items of work: ref counting needs to be thread-safe; ....
  • Multi-event case. 
  • Started to discuss the implementation of DataSvc with the real whiteboard.  Agreement that a single 'DataSvc' to hold several events is best. 
  • C++11 discussion
  • Take advantage of it? The current use could be replaced by other means (such as TBB). We  should be looking forward and code to the future.
  • Marco is willing to sell to LHCb C++11 in order to gain performance (the std library is taking advantage of it)
  • Meetings presentation
  • ....
 
23rd August 2012
  • Present: Benedikt, Danilo, Audrius, Pere
  • GaudiBusySleepTool: 
  • Tested by Danilo. It works with the exception of the error  "The algorithm stack is invalid". Probably the service 'AlgContextSvc'  needs to be disabled.
  • BusyWaitAlgorithm:
  • Needs to get as configuration the input items, output items, and busy time.
  • Immediate plans:
  • Start with a sub-event concurrency on single events.
  • Started to draw the design of services and algorithms on the blackboard.
 
9th of August 2012
(Some movements mentioned at the informal meeting)
 
  • Present: Marco, Audrius, Illya, Pere, Danilo
 
From Illya and Marco:
  • The GaudiParallelizer has been ported to the TBB library (so the GaudiMC git now contains two back-end implementations of the parallelizer; QuickThread can be added easily as well).
  • The 'GaudiBusySleepTool' tool is implemented (but not yet pushed to the GaudiMC master git). Several investigations have been carried out with the tool using the TBB-fied GaudiParallelizer algorithm manager:
  • While using the tool concurrently from several algorithms a problem was discovered: the 'AlgContextSvc' crashes with the messages "The algorithm stack is invalid" and "Non-empty stack of algorithms #". For the record: the service has been implemented by ATLAS to control/monitor the algorithm's call stack. The service is not really execution-critical (at least for LHCb) so it was turned off to allow further investigations (this 'switch off' is not yet pushed to the master git branch; TO BE decided how we want to bypass this).
  • The generation of several CLHEP distributions was tried out as the core of the busy computation in the tool, and the same problem occurred for all of them: the per-algorithm execution time explodes for such computation loads in parallel mode (x7 to x25 longer than in sequential mode). Several explanations have been put forward: TO BE investigated.
  • In contrast, a dummy busy computation composed of nested math functions behaves as expected: the per-algorithm execution time in parallel mode is the same as in sequential mode, and the total algorithm execution time shows almost ideal speedup.
 
19 July 2012
 
  • Present: Marco, Audrius, Benedikt, Illya, Pere, Danilo
  • Accounts to Marco for the GIT repository: Pere, Benedikt,... Good experience from Pere
  • Building GaudiMC. Experiences:
  • Benedikt: compile on LXPLUS seems fine
  • Pere: still problems with MacOSX
  • Installation of externals: TBB is already there. Still problems with libdispatch.
  • Running the tests is still a bit difficult. Solutions exist for eclipse but not for Xcode.
  • Audrius: transactional memory works sometimes but not always. We need to dig a little bit.
  • Discussion on transactional memory. Clearer code to do what we could do with locks. Potential for hardware support.
  • Holidays. We skip two weeks. 
  • Actions: 
  • Make GaudiMC working on MacOSX. 
  • Introduce canonical data flow in GaudiExamples ==> Add JIRA task by Benedikt executed by Marco
  • Study the possibility of developing 'busy code'.
  • e-group for the project (done)
 
12 July 2012
  • Present: Marco, Audrius, Benedikt, Illya, Pere, Sebastien[remote]
  • GIT repository. Copied Gaudi to GaudiMC in AFS
  • It is possible fetching changes from one URL and pushing to another URL
  • e-group ?? cannot be done. The e-group is still needed for the mail.
  • The instructions were sent and should work (attached here)
  • Needed externals: TBB, ... are already in the external area. In addition, there is a new directory 'cmaketools'. A new SVN repository should be set up. It is on the way...
  • Audrius testing 'transactions' with TBB.
  • Illya has been testing the Intel VTune Amplifier XE with Gaudi. He found 15 active threads in the 'standard' Gaudi! This is without adding the new TBB stuff. One should look in more detail at the origin of these threads.
  • MessageSvc implementation discussion: blocking and non-blocking tasks, different ways to 'schedule' the receiver tasks.
  • Sebastien understanding STM in the context of Go. 
  • gaudi-dev/manycore instructions:
===
# set up the minimal environment
setenv CMTCONFIG x86_64-slc5-gcc46-opt
setenv CMTPROJECTPATH /afs/cern.ch/sw/lcg/app/releases
setenv PATH /afs/cern.ch/sw/lcg/external/CMake/2.8.6/$CMTCONFIG/bin:$PATH
source /afs/cern.ch/sw/lcg/external/gcc/4.6.2/x86_64-slc5/setup.csh
 
# clone
cd Gaudi
# configure for push via ssh
git remote set-url --push origin ssh://lxplus.cern.ch/afs/cern.ch/sw/Gaudi/git/GaudiMC.git
 
# the default branch is 'dev/manycore', which includes the implementation of the special Message service
 
# build
./configure
make -j 20 -C build-dir install
 
# useful alias (to be improved)
alias gaudirun "python2.6 $PWD/cmake/env.py -x $PWD/InstallArea/$CMTCONFIG/GaudiEnvironment.xml  gaudirun.py"
 
# run an example
cd GaudiExamples/options
gaudirun GaudiCommonTests.opts
 
===
 
Actions:
  • Benedikt: Create e-group. Send accounts to Marco for the GIT repository.
  • All: Follow the instructions for building and running GaudiMC
  • Marco and Illya: Try an implementation of MessageSvc with libdispatch.
  • Benedikt: installation of externals (libdispatch)
 
 
Audio Conference Details
Dial-in numbers:+41227676000 (English-US, Main)
Access codes:  0168656 (Leader)
                       0115919 (Participant)
 
5 July 2012
  • Present: Marco, Audrius, Benedikt, Illya, Danilo, Pere
  • Introduced student.
  • GIT discussion. No service in IT (thinking about it). We go with the AFS implementation. Marco will make the setup. 
  • Create an e-group for repository access and mailing --> Benedikt 
  • Gaudi tutorial. Difficult to find the old introduction slides. Instructions for running gaudi sent by Marco.
  • Intel tutorial. Good VTune tool to keep in mind when doing the development
  • QuickThread discussion
  • Is it a fair comparison with TBB? The document is from the QuickThread people.
  • TBB will also come very probably with an I/O bound task. The idea of an I/O bound task is good. Can we overcome this by spawning raw threads? How many 'waiting' threads are going to be in a typical application? The ROOT I/O example is probably 1 for input and 1 for output. Order 10 threads.
  • The 'scheduling hints' based on NUMA requirements is also good.
  • Drawbacks: License, static library only distribution, ...
  • Marco update on MessageSvc
  • Generic SerialQueue implementation with TBB
  • There is a pattern in the TBB documentation that implements a serial task queue (or sequential)
  • Plans of Holidays
  • ...
Actions:
  • Benedikt to make available the module FindTBB.cmake. A build of TBB and libdispatch needs to be installed
  • Set up the GIT on AFS and distribute instructions
  • Illya will circulate the 'secret page' of the Gaudi Tutorial.
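The SerialQueue idea from Marco's MessageSvc update above can be sketched as follows. This is an illustration in Python on top of a generic thread pool, in the spirit of a serial task queue; it is neither the TBB pattern itself nor Marco's implementation, and `SerialQueue`, `producer` and the message format are hypothetical. The point shown is that no thread is parked on the receiving end: a drain task is only submitted when the queue goes from idle to non-empty.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class SerialQueue:
    """Runs submitted callables one at a time, in FIFO order, on a shared pool."""

    def __init__(self, pool):
        self._pool = pool
        self._lock = threading.Lock()
        self._items = deque()
        self._draining = False

    def submit(self, fn, *args):
        with self._lock:
            self._items.append((fn, args))
            if self._draining:
                return                 # the active drain task will pick it up
            self._draining = True
        self._pool.submit(self._drain)

    def _drain(self):
        while True:
            with self._lock:
                if not self._items:
                    self._draining = False
                    return
                fn, args = self._items.popleft()
            fn(*args)                  # executed outside the lock


def producer(log, messages, source):
    """Stands in for an algorithm task emitting 100 messages."""
    for n in range(100):
        log.submit(messages.append, (source, n))


messages = []
with ThreadPoolExecutor(max_workers=4) as pool:
    log = SerialQueue(pool)
    threads = [threading.Thread(target=producer, args=(log, messages, s))
               for s in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
# Leaving the with-block waits for the last drain task to finish.
```

Messages from different sources may interleave, but the order within each source is preserved and the sink is never called from two threads at once.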
 
28 June 2012
  • Present: Danilo, Marco, Illya, Markus, Benedikt, Pere
  • Marco explained the work he has been doing with the implementation of a MessageSvc supporting concurrency based on TBB. 
  • The main difficulty being that TBB does not provide a 'synchronous' dispatch queue as it is the case with libdispatch. He has worked out a solution that is not wasting a 'task' waiting in the receiver end of the message queue.
  • Copying the message from the 'algorithm task' to the queue may be costly. C++11 may come to the rescue with 'move semantics'. Discussed that we should be aiming for C++11 for this work.
  • For building with CMake he was missing FindTBB.cmake
  • Discussion on repositories.
  • Decided to start from the Gaudi repository. Marco explained the current situation of the various 'masters' in SVN and GIT and their synchronization.
  • Decided to create a new clone of Marco's GIT repository to avoid any interference with production version of Gaudi. This can either be in IT or implemented in AFS. People will have write access to this clone.
  • Discussion on the project name. Agreed for the time being to call it CF4hep (Concurrent Framework for HEP)
  • Ilya proposed to have a look at QuickThread library that implements a task concurrency model that is also adequate for I/O bounded tasks (waiting tasks). 
  • There is a document in there comparing QT with TBB, do not miss the Memory Allocation and Task Scheduler paragraphs at its end.
  • Plus some express introduction articles:
Actions:
  • Benedikt make available the module FindTBB.cmake
  • Benedikt will ask IT about the status of a possible GIT service.
  • For the people less familiar with Gaudi, Marco will send few pointers with tutorial material. 
  • People should have a look at the QuickThread documentation to discuss it at next meeting.
 
21 June 2012
 
  •  Present: Danilo, Marco, Ilya, Markus, Benedikt, Sebastien, Pere
  •  (Sebastien) Task force created in ATLAS with offliners and onliners to study MT issues. Tomorrow first meeting. Online/trigger interested in OpenCL and TBB. First goal is how to 'sandbox' an algorithm to support parallel execution (sub-algorithm parallelization). E.g. support for MT messaging.
  • Discussion on concurrency model implementation libraries and interfaces. Decided not to create artificial interfaces. The framework itself offers a good interface to specific high level functionality. So, framework service implementations ought to  be able to use directly the implementation libraries (e.g TBB,  libdispatch).
  •  Discussion on the possible problems with a MessageSvc. Used by concurrent algorithms, and use within a parallel algorithm implementation. Focus for the time being on concurrent algorithms only (leave MsgStream as it is basically).
  •  Discussion on code analysis tools. Not clear that the current plugins for the clang static code analyzer to locate statics, const casts, and mutable will give us a great deal of light in the problems that we will encounter. Important to be able to code the knowledge when we will be finding real case problems.   
  •  Discussion on code repositories. For the time being we are relaxed on what to use for small prototypes. Later we should be using the one provided by IT (SVN today) 
 
Actions:
  • Marco will start working on a concurrent MessageSvc implementation with TBB. Basically the same interface as existing GAUDI one, delay write (MT protection), spying on messages, storing some key to allow off-line sorting by thread/algorithm, etc.
  • Benedikt/Riccardo will show the current status of the WB prototype at the next meeting.
  • Danilo will circulate  information of the clang analyzer and an example code.
  • Using the "scan-build" tool allows indeed to make the syntax check a byproduct of the compilation, which by default is carried out with gcc. 
  • Benedikt will setup a new JIRA project to keep track of tasks and actions.
  • Sebastien mentioned code refactoring based on clang. He will circulate some pointers.
 
14 June 2012
Meeting cancelled due to the ATLAS software week meeting. 
 
7 June 2012
  • Present: Danilo, Marco, Illya, Markus, Benedikt, Pere
  • Discussed the main goals of the project. 
  • Development of few framework components to support concurrency. 
  • We are going to do it in the context of the Gaudi framework so as not to start from scratch.
  • Discussed some of them: concurrent whiteboard, algorithm scheduler, algorithm pool, message service, etc. So far the detector description will be treated as 'const' with a single copy. 
  • Decided to write notes of the meetings.