From Fedora Project Wiki
(Add tracker bug)
 
(27 intermediate revisions by 2 users not shown)
Line 2: Line 2:


== Summary ==
== Summary ==
The machine learning toolbox's focus is on large scale kernel methods and especially on Support Vector Machines (SVM).  It provides a generic SVM object interfacing to several different SVM implementations, among them the state of the art LibSVM and SVMlight.  Each of the SVMs can be combined with a variety of kernels.  The toolbox not only provides efficient implementations of the most common kernels, like the Linear, Polynomial, Gaussian and Sigmoid Kernel but also comes with a number of recent string kernels as e.g. the Locality Improved, Fischer, TOP, Spectrum, Weighted Degree Kernel (with shifts).  For the latter the efficient LINADD optimizations are implemented.  Also SHOGUN
SHOGUN is a large Scale Machine Learning Toolbox, being implemented in C++ and offering interfaces to C#, Java, Lua, Octave, Perl, Python, R and Ruby.
offers the freedom of working with custom pre-computed kernels.  One of its key features is the "combined kernel" which can be constructed by a weighted linear combination of a number of sub-kernels, each of which not necessarily working on the same domain.  An optimal sub-kernel weighting can be learned using Multiple Kernel Learning.  Currently SVM 2-class classification and regression problems can be dealt with.  However SHOGUN also implements a number of linear methods like Linear Discriminant Analysis (LDA), Linear Programming Machine (LPM), (Kernel) Perceptrons and features algorithms to train hidden Markov-models.  The input feature-objects can be dense, sparse or strings and of type int/short/double/char and can be converted into different feature types.  Chains of "pre-processors" (e.g. subtracting the mean) can be attached to each feature object allowing for on-the-fly pre-processing.
 
SHOGUN is implemented in C++ and offers interfaces to C#, Java, Lua, Octave, Perl, Python, R and Ruby.


== Owner ==
== Owner ==
* Name: [[User:besser82| Björn Esser]]
* Name: [[User:besser82|Björn Esser]]
* Email: besser82@fedoraproject.org
* Email: [mailto:besser82@fedoraproject.org besser82@fedoraproject.org]
* Release notes owner: <!--- To be assigned by docs team [[User:FASAccountName| Release notes owner name]] <email address> -->
* Release notes owner: <!--- To be assigned by docs team [[User:FASAccountName| Release notes owner name]] <email address> -->
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
* FESCo shepherd: [[User:FASAccountName| Shehperd name]] <email address>
-->


== Current status ==
== Current status ==
* Targeted release: [[Releases/21 | Fedora 21 ]]  
* Targeted release: [[Releases/21 | Fedora 21]]  
* Last updated: 2013-09-23
* Last updated: 2013-12-15
* Tracker bug: <will be assigned by the Wrangler>
* Tracker bug: [https://bugzilla.redhat.com/show_bug.cgi?id=1098143 #1098143]


== Detailed Description ==
== Detailed Description ==
<!-- Expand on the summary, if appropriateA couple sentences suffices to explain the goal, but the more details you can provide the better. -->
* Homepage: [http://shogun-toolbox.org/ The SHOGUN Machine Learning Toolbox]
* SCM-repo: [https://github.com/shogun-toolbox/shogun on GitHub]
* Documentation: [http://shogun-toolbox.org/doc/en/current/ is available here]
* further Information: [http://en.wikipedia.org/wiki/Shogun_%28toolbox%29 on Wikipedia]
 
The machine learning toolbox's focus is on large scale kernel methods and especially on [http://en.wikipedia.org/wiki/Support_vector_machine Support Vector Machines (SVM)].  It provides a generic [http://en.wikipedia.org/wiki/Support_vector_machine SVM object] interfacing to several different [http://en.wikipedia.org/wiki/Support_vector_machine SVM implementations], among them the state of the art [http://en.wikipedia.org/wiki/LIBSVM LibSVM].  Each of the [http://en.wikipedia.org/wiki/Support_vector_machine SVMs] can be combined with a variety of kernelsThe toolbox not only provides efficient implementations of the most common kernels, like the Linear, Polynomial, Gaussian and Sigmoid Kernel but also comes with a number of recent string kernels as e.g. the Locality Improved, Fischer, TOP, Spectrum, Weighted Degree Kernel (with shifts).  For the latter the efficient LINADD optimizations are implemented.  Also SHOGUN offers the freedom of working with custom pre-computed kernels.  One of its key features is the "combined kernel" which can be constructed by a weighted linear combination of a number of sub-kernels, each of which not necessarily working on the same domain. An optimal sub-kernel weighting can be learned using Multiple Kernel Learning.  Currently [http://en.wikipedia.org/wiki/Support_vector_machine SVM] 2-class classification and regression problems can be dealt with.  However SHOGUN also implements a number of linear methods like Linear Discriminant Analysis (LDA), Linear Programming Machine (LPM), (Kernel) Perceptrons and features algorithms to train hidden Markov-models.  The input feature-objects can be dense, sparse or strings and of type int/short/double/char and can be converted into different feature types.  Chains of "pre-processors" (e.g. subtracting the mean) can be attached to each feature object allowing for on-the-fly pre-processing.


== Benefit to Fedora ==
== Benefit to Fedora ==
<!-- What is the benefit to the platform?  If this is a major capability update, what has changed?  If this is a new functionality, what capabilities does it bring? Why will Fedora become a better distribution or project because of this proposal?-->
 
This will bring a Machine Learning Toolkit with a unique selection and versatility to Fedora.


== Scope ==
== Scope ==
<!-- What work do the developers have to accomplish to complete the change in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
* Proposal owners:
<!-- What work do the feature owners have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
* Other developers: N/A (not a System Wide Change) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- What work do other developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
* Release engineering: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- Does this feature require coordination with release engineering (e.g. changes to installer image generation or update package delivery)?  Is a mass rebuid required?  If a rel-eng ticket exists, add a link here.  -->


* Policies and guidelines: N/A (not a System Wide Change) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Proposal owners: Create the rpm-spec and file a review bug.  Have the package build after review was granted.
<!-- Do the packaging guidelines or other documents need to be updated for this feature?  If so, does it need to happen before or after the implementation is done?  If a FPC ticket exists, add a link here. -->
* Other developers: N/A (not a System Wide Change)
* Release engineering: N/A (not a System Wide Change)
* Policies and guidelines: N/A (not a System Wide Change)


== Upgrade/compatibility impact ==
== Upgrade/compatibility impact ==
<!-- What happens to systems that have had a previous versions of Fedora installed and are updated to the version containing this change? Will anything require manual configuration or data migration? Will any existing functionality be no longer supported? -->


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
N/A (not a System Wide Change)  


== How To Test ==
== How To Test ==
<!-- This does not need to be a full-fledged document. Describe the dimensions of tests that this change implementation is expected to pass when it is done.  If it needs to be tested with different hardware or software configurations, indicate them.  The more specific you can be, the better the community testing can be.
Remember that you are writing this how to for interested testers to use to check out your change implementation - documenting what you do for testing is OK, but it's much better to document what *I* can do to test your change.
A good "how to test" should answer these four questions:
0. What special hardware / data / etc. is needed (if any)?
1. How do I prepare my system to test this change? What packages
need to be installed, config files edited, etc.?
2. What specific actions do I perform to check that the change is
working like it's supposed to?
3. What are the expected results of those actions?
-->


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
N/A (not a System Wide Change)  


== User Experience ==
== User Experience ==
<!-- If this change proposal is noticeable by its target audience, how will their experiences change as a result?  Describe what they will see or notice. -->
 
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
N/A (not a System Wide Change)  


== Dependencies ==
== Dependencies ==
<!-- What other packages (RPMs) depend on this package?  Are there changes outside the developers' control on which completion of this change depends?  In other words, completion of another change owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate?  Other upstream projects like the kernel (if this is not a kernel change)? -->


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
N/A (not a System Wide Change)  


== Contingency Plan ==
== Contingency Plan ==
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "Revert the shipped configuration".  Or it might not (e.g. rebuilding a number of dependent packages).  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
 
* Contingency mechanism: (What to do?  Who will do it?) N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Contingency mechanism: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- When is the last time the contingency mechanism can be put in place?  This will typically be the beta freeze. -->
* Contingency deadline: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Contingency deadline: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Blocks release? N/A (not a System Wide Change), Yes/No <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Blocks release? N/A (not a System Wide Change) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->


== Documentation ==
== Documentation ==
<!-- Is there upstream documentation on this change, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)  
N/A (not a System Wide Change)  


== Release Notes ==
== Release Notes ==
<!-- The Fedora Release Notes inform end-users about what is new in the release.  Examples of past release notes are here: http://docs.fedoraproject.org/release-notes/ -->
<!-- The release notes also help users know how to deal with platform changes such as ABIs/APIs, configuration or data file formats, or upgrade concerns.  If there are any such changes involved in this change, indicate them here.  A link to upstream documentation will often satisfy this need.  This information forms the basis of the release notes edited by the documentation team and shipped with the release.
Release Notes are not required for initial draft of the Change Proposal but has to be completed by the Change Freeze.
-->


[[Category:ChangePageIncomplete]]
see Detailed Description
<!-- When your change proposal page is completed and ready for review and announcement -->
<!-- remove Category:ChangePageIncomplete and change it to Category:ChangeReadyForWrangler -->
<!-- The Wrangler announces the Change to the devel-announce list and changes the category to Category:ChangeAnnounced (no action required) -->
<!-- After review, the Wrangler will move your page to Category:ChangeReadyForFesco... if it still needs more work it will move back to Category:ChangePageIncomplete-->


<!-- Select proper category, default is Self Contained Change -->
[[Category:ChangeAcceptedF21]]
[[Category:SelfContainedChange]]
[[Category:SelfContainedChange]]
<!-- [[Category:SystemWideChange]] -->

Latest revision as of 11:41, 15 May 2014

The Shogun Machine Learning Toolbox

Summary

SHOGUN is a large Scale Machine Learning Toolbox, being implemented in C++ and offering interfaces to C#, Java, Lua, Octave, Perl, Python, R and Ruby.

Owner

Current status

Detailed Description

The machine learning toolbox's focus is on large scale kernel methods and especially on Support Vector Machines (SVM). It provides a generic SVM object interfacing to several different SVM implementations, among them the state of the art LibSVM. Each of the SVMs can be combined with a variety of kernels. The toolbox not only provides efficient implementations of the most common kernels, like the Linear, Polynomial, Gaussian and Sigmoid Kernel but also comes with a number of recent string kernels as e.g. the Locality Improved, Fischer, TOP, Spectrum, Weighted Degree Kernel (with shifts). For the latter the efficient LINADD optimizations are implemented. Also SHOGUN offers the freedom of working with custom pre-computed kernels. One of its key features is the "combined kernel" which can be constructed by a weighted linear combination of a number of sub-kernels, each of which not necessarily working on the same domain. An optimal sub-kernel weighting can be learned using Multiple Kernel Learning. Currently SVM 2-class classification and regression problems can be dealt with. However SHOGUN also implements a number of linear methods like Linear Discriminant Analysis (LDA), Linear Programming Machine (LPM), (Kernel) Perceptrons and features algorithms to train hidden Markov-models. The input feature-objects can be dense, sparse or strings and of type int/short/double/char and can be converted into different feature types. Chains of "pre-processors" (e.g. subtracting the mean) can be attached to each feature object allowing for on-the-fly pre-processing.

Benefit to Fedora

This will bring a Machine Learning Toolkit with a unique selection and versatility to Fedora.

Scope

  • Proposal owners: Create the rpm-spec and file a review bug. Have the package build after review was granted.
  • Other developers: N/A (not a System Wide Change)
  • Release engineering: N/A (not a System Wide Change)
  • Policies and guidelines: N/A (not a System Wide Change)

Upgrade/compatibility impact

N/A (not a System Wide Change)

How To Test

N/A (not a System Wide Change)

User Experience

N/A (not a System Wide Change)

Dependencies

N/A (not a System Wide Change)

Contingency Plan

  • Contingency mechanism: N/A (not a System Wide Change)
  • Contingency deadline: N/A (not a System Wide Change)
  • Blocks release? N/A (not a System Wide Change)

Documentation

N/A (not a System Wide Change)

Release Notes

see Detailed Description