Elena Canorea
Communications Lead
The main focus of this release has been the full migration of .NET Core 3.2 to .Net 6 in Sidra.
However, this release also includes important functional improvements. The most significant of them is the support for schema evolution in source systems of type database. While previous versions of Sidra already detected changes to the data source table schemas and notified the user of these changes, this release brings the support for fully automated schema amendment on the DSU table, adding new Attributes and Entities when they appear in the data source.
Knowledge Store ingestion performance and operational improvements have also been included here to accomplish different scenarios derived from the binary file ingestion, bringing the knowledge store to a fully supported scenario now.
Besides all these changes, some other technical evolution topics have been included in this release, including the update of Databricks cluster runtime to the latest long term support (LTS) version 10.4, and the migration of all the CI/CD Azure DevOps release pipelines to support the current windows-2022 agent.
Sidra also continues to consolidate the operational improvements around Client Applications deployment and plugin management.
One of the biggest changes included in this release has been the full migration of .NET version from 3.2 to .NET version 6. This was a critical and much needed effort, since Sidra was using .Net Core 3.2, which was going to go out EOL by the end of 2022. By moving to .Net 6 we are not just reaping the benefits of the newer platform, but with this new version being LTS, we are ensuring supportability until November 2024. This migration work has impacted all Core, DSU and Client Apps modules, including:
The Agent Pool in Sidra Azure DevOps Release pipeline has been upgraded to windows-2022, affecting both Core deployment and Client Apps pipelines. You can check more information about this technology evolution in this link.
In this release Sidra incorporates a new feature intended to automatically evolve the schema of the source tables, meaning that, whenever new columns of data are added, new Attributes will be created in Sidra Core metadata for each new column added recently.
Following the Sidra process, new columns will be created in their respective Databricks tables for that Entity. Then, once in each execution of the data extraction pipeline, Assets will be created including these new columns in Databricks. Also, whenever new tables are added in the source system, if these tables are configured to be included in the data extraction (in the metadata extraction options of the Data Intake Process configuration), new Entities will be created for these added tables and associated to the data extraction pipeline, so they can be included in the data extraction set.
For more details, you can see the related schema evolution documentation.
The runtime version of Databricks cluster for DSU resource groups has been upgraded to the latest long term stable (LTS) version that Databrick has released. You can check this information in Azure Databricks documentation.
A more robust mechanism to manage multiple Client Application pipelines per Entity has been included.
This has been done through a couple of improvements on the Client Applications side:
This feature consists of a few enhancements on the Client Application side:
More information about this feature can be checked in the documentation site.
For this new version of Sidra, information about Client Applications from the Asset table in Sidra Core has been included such as row count (Entities), errors (Validation Errors) and byte sizes (ByteSize), in order to keep a detailed tracking of the whole process from the Client Application side. The addition of these controlled fields allows a configuration advanced logic for use cases on the Client Application side.
We have refactored the plugins code to move each plugin to its own release branch independent of Sidra release and other plugins release. Additionally, we have added a new registration endpoint for Sidra plugins, to ease the compatibility set up between each new plugin version and the Sidra version.
This feature complements the current capability of Knowledge Store ingestion in Sidra, to support scenarios where we need a semi-automated process to remove artifacts from the index, or prepare the environment for a re-indexing of the documents, avoiding the previously ingested documents from affecting the index. This scenario could happen whenever there is a change in the index structure, an update of the skill model, etc.
A new Core API endpoint has been developed, that orchestrates the different actions required to prepare all the moving pieces of Sidra binary file ingestion to remove old documents from documents index, and intermediate and final data ingestion artifacts (raw storage, index, Knowledge Store, Databricks tables).
This endpoint supports two basic modes. In both modes, the intermediate structures and Asset state for the given Assets ingested by a specific pipeline are removed from Core. The set of these artifacts that are always removed are the following:
With regards to what to do with the raw documents in the intermediate raw storage container, two different modes have been defined for this endpoint:
We are now suporting custom SMTP settings, allowing the use of a different SMTP server other than the default Sendgrid account. This feature was required in order to enable certain advanced scenarios for password recovery and other requirements from some of our customers and partners. There are a new set of optional parameters in the CLI to support this at installation/upgrade time. More information is detailed in the Sidra’s documentation site.
This release includes some performance improvements regarding the Knowledge Store ingestion in Sidra. Different settings involved in the different phases of the document indexing have been tested and fine-tuned for improving the performance of the document indexing process.
These settings are resource sizing, degree of parallelism, search units, as well as infrastructure deployment settings for the associated skills executed by the Azure Search skillset.
In addition, an option in Sidra CLI deployment has been added to configure whether the Azure Search service will be deleted and re-created or not. This is a non mandatory parameter, and it is setup false by default. This setting allows the deployment to recreate the Azure Search resource with a different tier.
As well as Sidra Data Platform follows a continuous development, the documentation of Sidra keeps improving in order to ease as much as possible the experience for the user. For that reason, now it includes a section specially dedicated to the Sidra API, showing the different API’s endpoints and their schemas depending on the section of interest.
Access to the complete list of resolved issues and relevant changes in Sidra 2022. R2, here.
The CLI tool contains a parameter on the command profile
that replaces three others in the create
option, making the configuration easier for the user.
No action is required. The parameters devOpsProjectUrl
, devOpsProject
and gitRepository
have been replaced by devOpsRepositoryUrl
parameter, the URL of the target Azure DevOps repository, where code and template files are to be copied. It comes in the format like https://dev.azure.com/organizationName/projectName/_git/repositoryName. This way, internally the URL will be split into three profile JSON properties. More information can be found in the profile
command page.
The CLI tool now contains a new parameter on the command deploy
, RecreateAzureSearch
, in the configure
option, making the configuration easier for the user.
No action is required.
RecreateAzureSearch parameter can have been setup as true from a last execution additionally to having experienced some change in the currentAzureSearchSku.
No action is required. However, please note that, when doing an upgrading, if the new setting is set to true and an sku change of the Azure Search service is required, this will trigger a full recreation of the service, and will require a full reindexing of the files. In this case, the existing index will be removed.
Because of safety purposes, we are forcing the setting of this parameter at the end of the CLI execution to false, to avoid unintended consequences. In the case that the user purposely wants to recreate Azure Search, it will have to be done manually.
This 2022.R2 release represents an important technological evolution effort to keep up to date with underlying platform improvements, namely .NET Core, Azure DevOps and Databricks.
Important steps have been taken on improving the operations of plugins and Client Application deployments.
Finally, accelerators for automatically detecting source database/tables changes have been developed, as part of the schema evolution features. Future schema evolution will incorporate changes to detect and accommodate deletions of tables/columns in source systems.
As part of the next release, we are planning to increase the plugin catalogue to create and edit Data Intake Processes from Sidra Web.
We would love to hear from you! You can make a product suggestion, report an issue, ask questions, find answers and propose new features at our Sidra ideas portal, or by reaching out to us in info@sidra.dev.
Elena Canorea
Communications Lead
Cookie | Duration | Description |
---|---|---|
__cfduid | 1 year | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
__cfduid | 29 days 23 hours 59 minutes | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
__cfduid | 1 year | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
__cfduid | 29 days 23 hours 59 minutes | The cookie is used by cdn services like CloudFare to identify individual clients behind a shared IP address and apply security settings on a per-client basis. It does not correspond to any user ID in the web application and does not store any personally identifiable information. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_ga | 1 year | This cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, campaign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assign a randomly generated number to identify unique visitors. |
_gat_UA-326213-2 | 1 year | No description |
_gat_UA-326213-2 | 1 year | No description |
_gat_UA-326213-2 | 1 year | No description |
_gat_UA-326213-2 | 1 year | No description |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
_gid | 1 year | This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form. |
attributionCookie | session | No description |
cookielawinfo-checkbox-analytics | 1 year | Set by the GDPR Cookie Consent plugin, this cookie is used to record the user consent for the cookies in the "Analytics" category . |
cookielawinfo-checkbox-necessary | 1 year | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-necessary | 1 year | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-non-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-non-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-non-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-non-necessary | 1 year | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary". |
cookielawinfo-checkbox-performance | 1 year | Set by the GDPR Cookie Consent plugin, this cookie is used to store the user consent for cookies in the category "Performance". |
cppro-ft | 1 year | No description |
cppro-ft | 7 years 1 months 12 days 23 hours 59 minutes | No description |
cppro-ft | 7 years 1 months 12 days 23 hours 59 minutes | No description |
cppro-ft | 1 year | No description |
cppro-ft-style | 1 year | No description |
cppro-ft-style | 1 year | No description |
cppro-ft-style | session | No description |
cppro-ft-style | session | No description |
cppro-ft-style-temp | 23 hours 59 minutes | No description |
cppro-ft-style-temp | 23 hours 59 minutes | No description |
cppro-ft-style-temp | 23 hours 59 minutes | No description |
cppro-ft-style-temp | 1 year | No description |
i18n | 10 years | No description available. |
IE-jwt | 62 years 6 months 9 days 9 hours | No description |
IE-LANG_CODE | 62 years 6 months 9 days 9 hours | No description |
IE-set_country | 62 years 6 months 9 days 9 hours | No description |
JSESSIONID | session | The JSESSIONID cookie is used by New Relic to store a session identifier so that New Relic can monitor session counts for an application. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
viewed_cookie_policy | 1 year | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
viewed_cookie_policy | 1 year | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
VISITOR_INFO1_LIVE | 5 months 27 days | A cookie set by YouTube to measure bandwidth that determines whether the user gets the new or old player interface. |
wmc | 9 years 11 months 30 days 11 hours 59 minutes | No description |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | This cookie, set by Cloudflare, is used to support Cloudflare Bot Management. |
sp_landing | 1 day | The sp_landing is set by Spotify to implement audio content from Spotify on the website and also registers information on user interaction related to the audio content. |
sp_t | 1 year | The sp_t cookie is set by Spotify to implement audio content from Spotify on the website and also registers information on user interaction related to the audio content. |
Cookie | Duration | Description |
---|---|---|
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjAbsoluteSessionInProgress | 1 year | No description |
_hjFirstSeen | 29 minutes | No description |
_hjFirstSeen | 29 minutes | No description |
_hjFirstSeen | 29 minutes | No description |
_hjFirstSeen | 1 year | No description |
_hjid | 11 months 29 days 23 hours 59 minutes | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjid | 11 months 29 days 23 hours 59 minutes | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjid | 1 year | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjid | 1 year | This cookie is set by Hotjar. This cookie is set when the customer first lands on a page with the Hotjar script. It is used to persist the random user ID, unique to that site on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID. |
_hjIncludedInPageviewSample | 1 year | No description |
_hjIncludedInPageviewSample | 1 year | No description |
_hjIncludedInPageviewSample | 1 year | No description |
_hjIncludedInPageviewSample | 1 year | No description |
_hjSession_1776154 | session | No description |
_hjSessionUser_1776154 | session | No description |
_hjTLDTest | 1 year | No description |
_hjTLDTest | 1 year | No description |
_hjTLDTest | session | No description |
_hjTLDTest | session | No description |
_lfa_test_cookie_stored | past | No description |
Cookie | Duration | Description |
---|---|---|
loglevel | never | No description available. |
prism_90878714 | 1 month | No description |
redirectFacebook | 2 minutes | No description |
YSC | session | YSC cookie is set by Youtube and is used to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the video preferences of the user using embedded YouTube video. |
yt-remote-device-id | never | YouTube sets this cookie to store the video preferences of the user using embedded YouTube video. |
yt.innertube::nextId | never | This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | This cookie, set by YouTube, registers a unique ID to store data on what videos from YouTube the user has seen. |