Matches in Nanopublications for { ?s <http://purl.org/dc/terms/abstract> ?o ?g. }
Showing items 1 to 19 of
19
with 100 items per page.
- access.2023.3269660 abstract "Topic modeling comprises a set of machine learning algorithms that allow topics to be extracted from a collection of documents. These algorithms have been widely used in many areas, such as identifying dominant topics in scientific research. However, works addressing such problems focus on identifying static topics, providing snapshots that cannot show how those topics evolve. Aiming to close this gap, in this article, we describe an approach for dynamic article set analysis and classification. This is accomplished by querying open data of notable scientific databases via representational state transfers. After that, we enforce data management practices with a dynamic topic modeling approach on the associated metadata available. As a result, we identify research trends for a given field at specific instants and the referred terminology trends evolution throughout the years. It was possible to detect the associated lexical variation over time in published content, ultimately determining the so-called “hot topics” in arbitrary instants and how they correlate." assertion.
- access.2023.3269660 abstract "Topic modeling comprises a set of machine learning algorithms that allow topics to be extracted from a collection of documents. These algorithms have been widely used in many areas, such as identifying dominant topics in scientific research. However, works addressing such problems focus on identifying static topics, providing snapshots that cannot show how those topics evolve. Aiming to close this gap, in this article, we describe an approach for dynamic article set analysis and classification. This is accomplished by querying open data of notable scientific databases via representational state transfers. After that, we enforce data management practices with a dynamic topic modeling approach on the associated metadata available. As a result, we identify research trends for a given field at specific instants and the referred terminology trends evolution throughout the years. It was possible to detect the associated lexical variation over time in published content, ultimately determining the so-called “hot topics” in arbitrary instants and how they correlate." assertion.
- gigabyte.99 abstract "In China, 65 types of venomous snakes exist, with the Chinese Cobra Naja atra being prominent and a major cause of snakebites in humans. Furthermore, N. atra is a protected animal in some areas, as it has been listed as vulnerable by the International Union for Conservation of Nature. Recently, due to the medical value of snake venoms, venomics has experienced growing research interest. In particular, genomic resources are crucial for understanding the molecular mechanisms of venom production. Here, we report a highly continuous genome assembly of N. atra, based on a snake sample from Huangshan, Anhui, China. The size of this genome is 1.67 Gb, while its repeat content constitutes 37.8% of the genome. A total of 26,432 functional genes were annotated. This data provides an essential resource for studying venom production in N. atra. It may also provide guidance for the protection of this species." assertion.
- DS-230059 abstract "Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case." assertion.
- DS-240059 abstract "Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case." assertion.
- DS-240059 abstract "Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case." assertion.
- DS-240059 abstract "Measuring data drift is essential in machine learning applications where model scoring (evaluation) is done on data samples that differ from those used in training. The Kullback-Leibler divergence is a common measure of shifted probability distributions, for which discretized versions are invented to deal with binned or categorical data. We present the Unstable Population Indicator, a robust, flexible and numerically stable, discretized implementation of Jeffrey's divergence, along with an implementation in a Python package that can deal with continuous, discrete, ordinal and nominal data in a variety of popular data types. We show the numerical and statistical properties in controlled experiments. It is not advised to employ a common cut-off to distinguish stable from unstable populations, but rather to let that cut-off depend on the use case." assertion.
- a97f-egyk abstract "The Data Citation Principles cover purpose, function and attributes of citations. These principles recognize the dual necessity of creating citation practices that are both human understandable and machine-actionable." assertion.
- FAIRsharing.yknezb abstract "DataCite is a leading global non-profit organisation that provides persistent identifiers (DOIs) for research data. Their goal is to help the research community locate, identify, and cite research data with confidence. They support the creation and allocation of DOIs and accompanying metadata. They provide services that support the enhanced search and discovery of research content. They also promote data citation and advocacy through community-building efforts and responsive communication and outreach materials. DataCite gathers metadata for each DOI assigned to an object. The metadata is used for a large index of research data that can be queried directly to find data, obtain stats and explore connections. All the metadata is free to access and review. To showcase and expose the metadata gathered, DataCite provides an integrated search interface, where it is possible to search, filter and extract all the details from a collection of millions of records." assertion.
- sciencedb.00012. abstract "Archival Resource Keys (ARKs) are another widely used persistent identifier, supported by the California Digital Library, in collaboration with DuraSpace. ARKs work similarly to DOIs, but are more permissive in design." assertion.
- sciencedb.00013. abstract "N2T.net (Name-to-Thing) is a 'resolver', a kind of web server that stores little content itself and usually forwards incoming requests to other servers. Similar to URL shorteners like bit.ly, N2T serves content indirectly." assertion.
- sciencedb.00013. abstract "N2T.net (Name-to-Thing) is a 'resolver', a kind of web server that stores little content itself and usually forwards incoming requests to other servers. Similar to URL shorteners like bit.ly, N2T serves content indirectly." assertion.
- sciencedb.00014. abstract "EZID is a service and a platform provided by the California Digital Library for creating, registering, and managing persistent identifiers for scholarly research and cultural heritage objects, including but not limited to articles, datasets, images, and specimens." assertion.
- FAIRsharing.hFLKCn abstract "The digital object identifier (DOI) system originated in a joint initiative of three trade associations in the publishing industry (International Publishers Association; International Association of Scientific, Technical and Medical Publishers; Association of American Publishers). The system was announced at the Frankfurt Book Fair 1997. The International DOI Foundation (IDF) was created to develop and manage the DOI system, also in 1997. The DOI system was adopted as International Standard ISO 26324 in 2012. The DOI system implements the Handle System and adds a number of new features. The DOI system provides an infrastructure for persistent unique identification of objects of any type. The DOI system is designed to work over the Internet. A DOI name is permanently assigned to an object to provide a resolvable persistent network link to current information about that object, including where the object, or information about it, can be found on the Internet. While information about an object can change over time, its DOI name will not change. A DOI name can be resolved within the DOI system to values of one or more types of data relating to the object identified by that DOI name, such as a URL, an e-mail address, other identifiers and descriptive metadata. The DOI system enables the construction of automated services and transactions. Applications of the DOI system include but are not limited to managing information and documentation location and access; managing metadata; facilitating electronic transactions; persistent unique identification of any form of any data; and commercial and non-commercial transactions. The content of an object associated with a DOI name is described unambiguously by DOI metadata, based on a structured extensible data model that enables the object to be associated with metadata of any desired degree of precision and granularity to support description and services. The data model supports interoperability between DOI applications. The scope of the DOI system is not defined by reference to the type of content (format, etc.) of the referent, but by reference to the functionalities it provides and the context of use. The DOI system provides, within networks of DOI applications, for unique identification, persistence, resolution, metadata and semantic interoperability." assertion.
- DSP4FAIR abstract "Data Stewardship Plan (DSP) templates prompt users to consider various issues but typically have no requirements for actual implementation choices. But as FAIR methodologies mature the DSP will become a more directive “how to” manual for making data FAIR." assertion.
- DSP4FAIR abstract "Data Stewardship Plan (DSP) templates prompt users to consider various issues but typically have no requirements for actual implementation choices. But as FAIR methodologies mature the DSP will become a more directive “how to” manual for making data FAIR." assertion.
- DS-230058 abstract "Sarcasm is a linguistic phenomenon often indicating a disparity between literal and inferred meanings. Due to its complexity, it is typically difficult to discern it within an online text message. Consequently, in recent years sarcasm detection has received considerable attention from both academia and industry. Nevertheless, the majority of current approaches simply model low-level indicators of sarcasm in various machine learning algorithms. This paper aims to present sarcasm in a new light by utilizing novel indicators in a deep weighted average ensemble-based framework (DWAEF). The novel indicators pertain to exploiting the presence of simile and metaphor in text and detecting the subtle shift in tone at a sentence’s structural level. A graph neural network (GNN) structure is implemented to detect the presence of simile, bidirectional encoder representations from transformers (BERT) embeddings are exploited to detect metaphorical instances and fuzzy logic is employed to account for the shift of tone. To account for the existence of sarcasm, the DWAEF integrates the inputs from the novel indicators. The performance of the framework is evaluated on a self-curated dataset of online text messages. A comparative report between the results acquired using primitive features and those obtained using a combination of primitive features and proposed indicators is provided. The highest accuracy of 92% was achieved after applying DWAEF, the proposed framework which combines the primitive features and novel indicators together as compared to 78.58% obtained using Support Vector Machine (SVM) which was the lowest among all classifiers." assertion.
- DS-220058 abstract "Sarcasm is a linguistic phenomenon often indicating a disparity between literal and inferred meanings. Due to its complexity, it is typically difficult to discern it within an online text message. Consequently, in recent years sarcasm detection has received considerable attention from both academia and industry. Nevertheless, the majority of current approaches simply model low-level indicators of sarcasm in various machine learning algorithms. This paper aims to present sarcasm in a new light by utilizing novel indicators in a deep weighted average ensemble-based framework (DWAEF). The novel indicators pertain to exploiting the presence of simile and metaphor in text and detecting the subtle shift in tone at a sentence’s structural level. A graph neural network (GNN) structure is implemented to detect the presence of simile, bidirectional encoder representations from transformers (BERT) embeddings are exploited to detect metaphorical instances and fuzzy logic is employed to account for the shift of tone. To account for the existence of sarcasm, the DWAEF integrates the inputs from the novel indicators. The performance of the framework is evaluated on a self-curated dataset of online text messages. A comparative report between the results acquired using primitive features and those obtained using a combination of primitive features and proposed indicators is provided. The highest accuracy of 92% was achieved after applying DWAEF, the proposed framework which combines the primitive features and novel indicators together as compared to 78.58% obtained using Support Vector Machine (SVM) which was the lowest among all classifiers." assertion.
- DS-220058 abstract "Sarcasm is a linguistic phenomenon often indicating a disparity between literal and inferred meanings. Due to its complexity, it is typically difficult to discern it within an online text message. Consequently, in recent years sarcasm detection has received considerable attention from both academia and industry. Nevertheless, the majority of current approaches simply model low-level indicators of sarcasm in various machine learning algorithms. This paper aims to present sarcasm in a new light by utilizing novel indicators in a deep weighted average ensemble-based framework (DWAEF). The novel indicators pertain to exploiting the presence of simile and metaphor in text and detecting the subtle shift in tone at a sentence’s structural level. A graph neural network (GNN) structure is implemented to detect the presence of simile, bidirectional encoder representations from transformers (BERT) embeddings are exploited to detect metaphorical instances and fuzzy logic is employed to account for the shift of tone. To account for the existence of sarcasm, the DWAEF integrates the inputs from the novel indicators. The performance of the framework is evaluated on a self-curated dataset of online text messages. A comparative report between the results acquired using primitive features and those obtained using a combination of primitive features and proposed indicators is provided. The highest accuracy of 92% was achieved after applying DWAEF, the proposed framework which combines the primitive features and novel indicators together as compared to 78.58% obtained using Support Vector Machine (SVM) which was the lowest among all classifiers." assertion.