<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	
	>
<channel>
	<title>
	Comments on: Data Preparation and R Packages for Cluster Analysis	</title>
	<atom:link href="https://www.datanovia.com/en/lessons/data-preparation-and-r-packages-for-cluster-analysis/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.datanovia.com/en/lessons/data-preparation-and-r-packages-for-cluster-analysis/</link>
	<description>Data Mining and Statistics for Decision Support</description>
	<lastBuildDate>Sun, 12 May 2019 06:45:25 +0000</lastBuildDate>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.2</generator>
	<item>
		<title>
		By: Julian		</title>
		<link>https://www.datanovia.com/en/lessons/data-preparation-and-r-packages-for-cluster-analysis/#comment-1939</link>

		<dc:creator><![CDATA[Julian]]></dc:creator>
		<pubDate>Sun, 12 May 2019 06:45:25 +0000</pubDate>
		<guid isPermaLink="false">https://www.datanovia.com/en/?post_type=dt_lessons&#038;p=7644#comment-1939</guid>

					<description><![CDATA[Dear Dr Kassambara,
as many others have already said - thank you very much for this great site! It is indeed a resource I see myself coming back to again and again.
Regarding data preprocessing, I have been wondering how to deal with skewed data - should some form of power transformation be applied to get them into a more &quot;Gaussian&quot; shape, or are different distance metrics better suited than the Euclidean distance, or does it not matter in the end?]]></description>
			<content:encoded><![CDATA[<p>Dear Dr Kassambara,<br />
as many others have already said &#8211; thank you very much for this great site! It is indeed a resource I see myself coming back to again and again.<br />
Regarding data preprocessing, I have been wondering how to deal with skewed data &#8211; should some form of power transformation be applied to get them into a more &#8220;Gaussian&#8221; shape, or are different distance metrics better suited than the Euclidean distance, or does it not matter in the end?</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: PS		</title>
		<link>https://www.datanovia.com/en/lessons/data-preparation-and-r-packages-for-cluster-analysis/#comment-1650</link>

		<dc:creator><![CDATA[PS]]></dc:creator>
		<pubDate>Wed, 23 Jan 2019 12:33:24 +0000</pubDate>
		<guid isPermaLink="false">https://www.datanovia.com/en/?post_type=dt_lessons&#038;p=7644#comment-1650</guid>

					<description><![CDATA[Hi, 

I love this site. It is really helping me out in a &quot;Cluster Analysis&quot; project.

I would like to understand which techniques should be used to cluster very large datasets (my dataset has about 3 million rows). I am stuck because functions like &quot;get_clust_tendency&quot;, and even the kmeans and hclust algorithms, throw a &quot;cannot allocate vector of 17000 Gb&quot; error.

Is there a better way to approach clustering on big datasets?]]></description>
			<content:encoded><![CDATA[<p>Hi, </p>
<p>I love this site. It is really helping me out in a &#8220;Cluster Analysis&#8221; project.</p>
<p>I would like to understand which techniques should be used to cluster very large datasets (my dataset has about 3 million rows). I am stuck because functions like &#8220;get_clust_tendency&#8221;, and even the kmeans and hclust algorithms, throw a &#8220;cannot allocate vector of 17000 Gb&#8221; error.</p>
<p>Is there a better way to approach clustering on big datasets?</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Alexis Idlette-Wilson		</title>
		<link>https://www.datanovia.com/en/lessons/data-preparation-and-r-packages-for-cluster-analysis/#comment-1549</link>

		<dc:creator><![CDATA[Alexis Idlette-Wilson]]></dc:creator>
		<pubDate>Sun, 02 Dec 2018 01:38:16 +0000</pubDate>
		<guid isPermaLink="false">https://www.datanovia.com/en/?post_type=dt_lessons&#038;p=7644#comment-1549</guid>

					<description><![CDATA[This site is awesome. Thank you!]]></description>
			<content:encoded><![CDATA[<p>This site is awesome. Thank you!</p>
]]></content:encoded>
		
			</item>
	</channel>
</rss>
