Alfresco auditing and Site access reporting
This blog discusses an approach to reporting on site access and activity. We previously looked at the AuditShare add-on, which impressed us, but we found that it did not perform well in an environment with thousands of users. The solution that best matched our customer's Alfresco environment has the following aspects, each of which will be discussed in this blog and the next two:
- Alfresco Auditing was used to trap any access to content in a custom Audit Application.
- A scheduled job was created to query the audit and report on access for each site.
- The report was created monthly after which the audit entries were removed.
Alfresco Auditing
Enable Auditing
Auditing must be enabled for the system by adding the following entries to the alfresco-global.properties file:
audit.enabled=true
audit.alfresco-access.enabled=true
audit.alfresco-access.sub-actions.enabled=false
This enables auditing and also enables the audit data producer called alfresco-access. The alfresco-access producer traps high-level entries for a user action. For example, when a user creates a node, it records one high-level audit transaction called CREATE. If you need to audit at a lower level, it is also possible to use the audit API data producer to trap specific Alfresco API calls.
Create Custom Audit Application
Alfresco allows you to configure an audit application that determines which values are recorded when an audit event occurs. In our case we need to record the site name, user, content accessed and event type with each event. Having your own custom application also means you can manage deletion of its audit entries separately from other audit entries, so we can report against them monthly and then clear them once the report has been created. If you need to keep audit data beyond the month, the entries for the alfresco-access application are left in place.
The audit application was created by adding alfresco-audit-site.xml to the tomcat/shared/classes/alfresco/extension/audit folder. The file contains the following XML:
<Audit xmlns="http://www.alfresco.org/repo/audit/model/3.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.alfresco.org/repo/audit/model/3.2 alfresco-audit-3.2.xsd">
    <DataExtractors>
        <DataExtractor name="simpleValue" registeredName="auditModel.extractor.simpleValue"/>
        <DataExtractor name="siteName" class="com.seedim.audit.dataextractor.SiteNameDataExtractor"/>
        <DataExtractor name="nodeName" registeredName="auditModel.extractor.nodeName"/>
        <DataExtractor name="nodeType" registeredName="auditModel.extractor.nodeType"/>
    </DataExtractors>
    <DataGenerators>
        <DataGenerator name="personFullName" registeredName="auditModel.generator.personFullName"/>
    </DataGenerators>
    <PathMappings>
        <PathMap source="/alfresco-access" target="/siteaccess"/>
    </PathMappings>
    <Application name="siteaccess" key="siteaccess">
        <RecordValue key="access" dataExtractor="simpleValue" dataSource="/siteaccess/transaction/action" dataTrigger="/siteaccess/transaction/action"/>
        <RecordValue key="user" dataExtractor="simpleValue" dataSource="/siteaccess/transaction/user" dataTrigger="/siteaccess/transaction/user"/>
        <RecordValue key="path" dataExtractor="simpleValue" dataSource="/siteaccess/transaction/path" dataTrigger="/siteaccess/transaction/path"/>
        <RecordValue key="site" dataExtractor="siteName" dataSource="/siteaccess/transaction/path" dataTrigger="/siteaccess/transaction/path"/>
    </Application>
</Audit>
DataExtractors and DataGenerators
These are used to turn the audit data into values that the RecordValue and GenerateValue elements of the audit application can record. We created a custom DataExtractor called siteName to extract the site name from the path value. To create a custom extractor we need to extend the AbstractDataExtractor class. The snippet below from our custom class shows the extractData method, which receives the passed-in value and returns the site name.
public Serializable extractData(Serializable in) throws Throwable
{
    String path = (String) in;
    String siteName = "cm:Repository"; // default to repository for content not associated with sites.
    if (path.contains("st:sites"))
    {
        siteName = StringUtils.substringBetween(path, "/st:sites/", "/");
        if (logger.isDebugEnabled())
        {
            logger.debug("SiteName: " + siteName);
        }
    }
    return siteName;
}
One thing to note with extractors is that you need to pass in a value. The alfresco-access audit entries do not include a NodeRef, so we chose to pass the path into the extractor so that we could get the site id. Package the Java class into a jar file and place it in /tomcat/webapps/alfresco/WEB-INF/lib. To register the extractor you can simply reference the class you created (our approach), or alternatively register it as a bean if you need to inject a service bean such as SiteService or NodeService.
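To make the path handling concrete, here is a minimal sketch of the same site-name logic using only the JDK, so it can be run outside Alfresco. The class name SiteNameFromPath and the method siteNameFromPath are hypothetical and are not part of our extractor. Unlike the extractData method above, this variant also returns the site name when the path ends at the site node itself, which is an assumption about how you might want that case handled.

```java
final class SiteNameFromPath {
    // Same default as the extractor: content outside any site.
    static final String DEFAULT_SITE = "cm:Repository";
    static final String SITES_PREFIX = "/st:sites/";

    /** Returns the path segment that follows /st:sites/, or the default. */
    static String siteNameFromPath(String path) {
        if (path == null) {
            return DEFAULT_SITE;
        }
        int start = path.indexOf(SITES_PREFIX);
        if (start < 0) {
            return DEFAULT_SITE;
        }
        start += SITES_PREFIX.length();
        int end = path.indexOf('/', start);
        // Assumption: a path that stops at the site node still counts as that site.
        return end < 0 ? path.substring(start) : path.substring(start, end);
    }

    public static void main(String[] args) {
        String path = "/app:company_home/st:sites/cm:surf-config"
                + "/cm:module-deployments/cm:Customise Share Menu.xml";
        System.out.println(siteNameFromPath(path)); // prints cm:surf-config
    }
}
```

Note that the value is whatever sits directly under st:sites, so system folders such as cm:surf-config are reported as if they were sites.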
Registering the extractor
The custom data extractor is registered by adding the following entry to our alfresco-audit-site.xml:
<DataExtractor name="siteName" class="com.seedim.audit.dataextractor.SiteNameDataExtractor"/>
Calling the extractor to get the site name:
<RecordValue key="site" dataExtractor="siteName" dataSource="/siteaccess/transaction/path" dataTrigger="/siteaccess/transaction/path"/>
Path Mappings
We are still a little unclear on exactly how path mappings work. From our work we determined that the following mapping takes the entries produced by the alfresco-access audit and maps them to our application under the key siteaccess. The key can then be used when querying the audit.
<PathMappings>
    <PathMap source="/alfresco-access" target="/siteaccess"/>
</PathMappings>
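Our working model is that a PathMap behaves like a prefix rewrite on the paths of the produced audit data, which would explain why the RecordValue elements refer to /siteaccess/transaction/... rather than /alfresco-access/transaction/.... The sketch below only illustrates that assumption and is not Alfresco's implementation; the class PathMapSketch and the method applyPathMap are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PathMapSketch {
    /**
     * Copies every entry whose path lies under sourcePrefix, replacing that
     * prefix with targetPrefix. Entries under other prefixes are dropped.
     */
    static Map<String, String> applyPathMap(Map<String, String> produced,
                                            String sourcePrefix,
                                            String targetPrefix) {
        Map<String, String> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : produced.entrySet()) {
            String path = entry.getKey();
            if (path.equals(sourcePrefix) || path.startsWith(sourcePrefix + "/")) {
                String rewritten = targetPrefix + path.substring(sourcePrefix.length());
                mapped.put(rewritten, entry.getValue());
            }
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, String> produced = new LinkedHashMap<>();
        produced.put("/alfresco-access/transaction/action", "UPDATE CONTENT");
        produced.put("/alfresco-access/transaction/user", "EXT_SEEDIM");
        produced.put("/other-producer/some/value", "ignored");
        System.out.println(applyPathMap(produced, "/alfresco-access", "/siteaccess"));
        // prints {/siteaccess/transaction/action=UPDATE CONTENT, /siteaccess/transaction/user=EXT_SEEDIM}
    }
}
```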
The Audit Application
The definition of the audit application is relatively straightforward. We name the application and then specify which values to record using our data extractors and data generators. The snippet below shows the XML element that names the application with a key of siteaccess (used for querying the application, as explained above). Each RecordValue element says which key to use when writing to our audit entry and which extractor to use to get the value. The snippet also shows a RecordValue that uses our custom data extractor for the site name.
<Application name="siteaccess" key="siteaccess">
    <RecordValue key="access" dataExtractor="simpleValue" dataSource="/siteaccess/transaction/action" dataTrigger="/siteaccess/transaction/action"/>
    <RecordValue key="site" dataExtractor="siteName" dataSource="/siteaccess/transaction/path" dataTrigger="/siteaccess/transaction/path"/>
</Application>
Checking that our Audit Application is working
To test that our custom audit application does what we require, we used the out-of-the-box audit query webscript to get a JSON response of all the audit entries recorded by the application. The following URL was used, with siteaccess being the name of our audit application:
http://hostname:8080/alfresco/service/api/audit/query/siteaccess?verbose=true&limit=0
This returned JSON of the following form (the excerpt is truncated after the first of the 30 entries):
{
  "count":30,
  "entries":
  [
    {
      "id":22095,
      "application":"siteaccess",
      "user":"EXT_SEEDIM",
      "time":"2014-09-02T11:02:36.875+10:00",
      "values":
      {
        "\/siteaccess\/path":"\/app:company_home\/st:sites\/cm:surf-config\/cm:module-deployments\/cm:Customise Share Menu.xml",
        "\/siteaccess\/access":"UPDATE CONTENT",
        "\/siteaccess\/user":"EXT_SEEDIM",
        "\/siteaccess\/site":"cm:surf-config"
      }
    }
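The same query can also be issued from code, which is handy when checking the application repeatedly. The following is a minimal sketch using the Java 11 HttpClient. The class AuditQueryClient and its methods are hypothetical names, and it assumes the webscript is called with HTTP Basic authentication by a user who is allowed to query the audit; the host and credentials are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class AuditQueryClient {
    /** Builds the audit query URL for an application, e.g. siteaccess. */
    static String buildQueryUrl(String baseUrl, String application,
                                boolean verbose, int limit) {
        return baseUrl + "/service/api/audit/query/" + application
                + "?verbose=" + verbose + "&limit=" + limit;
    }

    /** Fetches the JSON response body; fails on any non-200 status. */
    static String fetch(String url, String user, String password) throws Exception {
        // Assumption: Basic authentication with an account permitted to read the audit.
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Basic " + credentials)
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException(
                    "Audit query failed with HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String url = buildQueryUrl("http://hostname:8080/alfresco", "siteaccess", true, 0);
        System.out.println(url);
        // Only contact the server when placeholder credentials are replaced via arguments.
        if (args.length == 2) {
            System.out.println(fetch(url, args[0], args[1]));
        }
    }
}
```

Run without arguments it only prints the URL; pass a user name and password to perform the request against your own server.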
Conclusion
This concludes the blog on enabling auditing, creating a custom audit application, creating a custom data extractor and testing that the audit application is working. In our next blog we will discuss the creation of a custom scheduled job that queries the audit service and reports on access for each Share site.